The present application includes a Sequence Listing in
a single file named pto_PB0177.txt, having 703 kilobytes, last
modified on January 17, 2002 and recorded January 24, 2002.
The Sequence Listing contained in said file on said disc is
incorporated herein by reference in its entirety.



FIELD OF THE INVENTION 

The present invention relates to novel human testis
expressed patched like protein, including two isoforms. More
specifically, the invention provides isolated nucleic acid
molecules encoding human testis expressed patched like protein,
fragments thereof, vectors and host cells comprising isolated
nucleic acid molecules encoding human testis expressed patched
like protein, human testis expressed patched like protein
polypeptides, antibodies, transgenic cells and non-human
organisms, and diagnostic, therapeutic, and investigational
methods of using the same.

BACKGROUND OF THE INVENTION 

Sterols are key components of all cell membranes in
eukaryotes. Cholesterol, in particular, plays an important
role in regulating the properties of cell membranes. It
rigidifies the fluid membrane by occupying the spaces between
the saturated hydrocarbon chains of the sphingolipids, thus
reducing passive permeability and increasing the mechanical
durability of the lipid bilayer. Simons and Ikonen, Science
290:1721-1726 (2000). Cholesterol is also thought to increase
post-Golgi protein sorting by regulating the thickness of the
lipid bilayer. Simons and Ikonen, Science 290:1721-1726
(2000). It also functions in lipid raft assembly and function.
These dynamic assemblies are involved in sorting and
distributing lipids and proteins to the cell surface, where
they play an important role in signal transduction and in
generating cell surface polarity. Simons and Ikonen, Nature
387:569-572 (1997). It is now widely accepted that "not only
the total cellular level of cholesterol but also its
distribution between membranes and within a given membrane are
tightly regulated". Simons and Ikonen, Science 290:1721-1726
(2000).

The dynamic cholesterol balance between different
cellular compartments is achieved through several important
proteins. These include the sterol-regulated enzyme
3-hydroxy-3-methylglutaryl (HMG) CoA reductase, as well as SREBP
cleavage activating protein (SCAP), a membrane-bound
glycoprotein that regulates the proteolytic activation of sterol
regulatory element binding proteins (SREBPs). Hua et al., Cell
87:415-426 (1996); Nohturfft et al., Proc. Natl. Acad. Sci. USA
95:12848-12853 (1998). A common domain has been identified in
these two proteins and named the sterol-sensing domain (SSD).
Similar sequences have been found in several other proteins,
suggesting that these proteins also play a role in
cholesterol-related cellular functions. These proteins include
the Niemann-Pick type C1 protein necessary for export of
cholesterol from lysosomes (Stacie et al., Science 277:232-235
(1997)); Patched, a receptor for Hedgehog (Tabin and McMahon,
Trends Cell Biol. 7:442-446 (1997)); a Patched-related protein
implicated in hereditary renal cell carcinoma (TRC8; Gemmill et
al., Proc. Natl. Acad. Sci. USA 95:9572-9577 (1998)); and
Dispatched, which is critical to the release of hedgehog (Burke
et al., Cell 99:803-815 (1999)). The sterol-sensing domain is
indispensable for the functions of the above proteins. For
example, mutations in the SSD of SCAP result in the disruption
of cholesterol homeostasis. Nohturfft et al., Proc. Natl. Acad.
Sci. USA 95:12848-12853 (1998). Recent evidence suggests that
the sterol-sensing domain in Patched is critical in regulating
vesicular trafficking of its receptor partner Smoothened.
Strutt et al., Curr. Biol. 11:608-613 (2001); Martin et al.,
Curr. Biol. 11:601-607 (2001).

Hedgehog, the Patched ligand, is the only protein known to
have a covalently attached cholesterol moiety. Porter et al.,
Science 274:255-259 (1996). Three hedgehog homologues have been
identified in mammals: sonic hedgehog, Indian hedgehog, and
desert hedgehog. Tabin and McMahon, Trends Cell Biol. 7:442-446
(1997). Function of the hedgehog signaling pathway has been
studied extensively. It has been shown that Patched is a
twelve transmembrane protein and forms a complex with
Smoothened, a seven transmembrane protein, which acts as the
signaling component of the pathway. Without hedgehog binding,
Patched inhibits Smoothened signaling and thus ensures that the
signaling is tightly controlled by ligand regulation. Kalderon,
Cell 103:371-374 (2000). Mutations in sonic hedgehog, Patched,
and Smoothened have all been shown to cause tumors in skin and
brain, among other tissues. Wicking et al., Oncogene 18:7844-7851
(1999). For example, mutations in the Patched gene result
in basal cell carcinoma; thus Patched is a tumor suppressor
gene. Wicking et al., Oncogene 18:7844-7851 (1999). Desert
hedgehog is expressed in testis and has been suggested to be
the ligand of Patched-2. Carpenter et al., Proc. Natl. Acad.
Sci. USA 95:13630-13634 (1998). Although Patched-2 has yet to
be demonstrated as a tumor suppressor gene, its involvement in
germ cell development suggests that its dysfunction may cause
malformation of germ cells.



Given the fact that Patched is a transmembrane
protein functioning in the hedgehog signaling pathway,
mutations of which cause a variety of tumors, it has potential
therapeutic as well as diagnostic roles in cancer. There is a
need to identify and to characterize additional Patched like
proteins.



SUMMARY OF THE INVENTION



The present invention solves these and other needs in
the art by providing isolated nucleic acids that encode human
testis expressed Patched like protein (HTPL), including two
isoforms (HTPL-L for the long form and HTPL-S for the short
form), and fragments thereof.

In other aspects, the invention provides vectors for
propagating and expressing the nucleic acids of the present
invention, host cells comprising the nucleic acids and vectors
of the present invention, proteins, protein fragments, and
protein fusions of HTPL, and antibodies thereto.

The invention further provides pharmaceutical
formulations of the nucleic acids, proteins, and antibodies of
the present invention.

In other aspects, the invention provides transgenic 
cells and non-human organisms comprising HTPL nucleic acids, 
and transgenic cells and non-human organisms with targeted 
disruption of the endogenous orthologue of the HTPL. 

The invention additionally provides diagnostic, 
investigational, and therapeutic methods based on the HTPL 
nucleic acids, proteins, and antibodies of the present 
invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The above and other objects and advantages of the 
present invention will be apparent upon consideration of the 
following detailed description taken in conjunction with the 
accompanying drawings, in which like characters refer to like 
parts throughout, and in which: 

FIG. 1 (A) schematizes the protein domain structure 
of HTPL-L and HTPL-S, and FIG. 1 (B) shows the alignment of the 
Patched motif of HTPL-L with that of other Patched domain 
containing proteins; 

FIG. 2 is a map showing the genomic structure of HTPL
encoded at chromosome 10p12.1;

FIG. 3 presents the nucleotide and predicted amino 
acid sequences of HTPL-L; 

FIG. 4 presents the nucleotide and predicted amino 
acid sequences of HTPL-S; and 

FIG. 5 presents the expression analysis of HTPL by
RT-PCR.



DETAILED DESCRIPTION OF THE INVENTION 



Mining the sequence of the human genome for novel
human genes, the present inventors have identified HTPL,
including two isoforms, a testis expressed Patched like
transmembrane tumor suppressor involved in germ cell
development.

The newly isolated HTPL gene has two isoforms, with a
few single base pair differences between the two (FIG. 3 and
FIG. 4). One of the single base pair changes in HTPL-S
introduces a premature stop codon in HTPL-S (S for short)
compared to HTPL-L (L for long). HTPL shares an overall
structural organization with the Patched protein. The shared
structural features strongly imply that HTPL plays a role
similar to that of Patched, in the hedgehog signaling pathway,
and is a potential tumor suppressor.

Like Patched, HTPL-L contains a Patched domain
(http://pfam.wustl.edu/hmmsearch.shtml), a sterol-sensing
domain (SSD, http://motif.genome.ad.jp/) and twelve
transmembrane domains
(http://smart.embl-heidelberg.de/smart/show_motifs.pl). The
Patched domain in HTPL-L covers amino acids 166 - 952 of HTPL-L.
The SSD in HTPL-L covers amino acids 383 - 540 of HTPL-L. The
presence of these domains in HTPL-L suggests that HTPL-L,
like Patched and other Patched domain containing proteins, is
involved in the Hedgehog signaling pathway (see the Background
section). Because of the presence of the premature stop codon,
HTPL-S contains a partial Patched domain, a complete
sterol-sensing motif and seven transmembrane domains. The
presence of these domains in HTPL-S suggests that HTPL-S is
also involved in the Hedgehog signaling pathway.

Other signatures of the newly isolated HTPL proteins
were identified by searching the PROSITE database
(http://www.expasy.ch/tools/scnpsit1.html), and the list below
is for both HTPL-L and HTPL-S unless specified otherwise. These
include seven N-glycosylation sites (192 - 195, 275 - 278, 279
- 282, 530 - 533, 678 - 681, 692 - 695 and 737 - 740), one
cAMP- and cGMP-dependent protein kinase phosphorylation site
(201 - 204), seven protein kinase C phosphorylation sites (194
- 196, 200 - 202, 508 - 510, 561 - 563, 662 - 664, 746 - 748,
and 759 - 761; plus one for HTPL-L at 800 - 802), twelve casein
kinase II phosphorylation sites (19 - 22, 36 - 39, 62 - 65, 79
- 82, 190 - 193, 215 - 218, 219 - 222, 225 - 228, 230 - 233,
572 - 575, 597 - 600, and 740 - 743), two tyrosine kinase
phosphorylation sites (329 - 335, and 681 - 688; plus one for
HTPL-L at 887 - 893), four N-myristoylation sites (307 - 312,
418 - 423, 504 - 509, and 535 - 540; plus one for HTPL-L at
935 - 940), and a single amidation site at 541 - 544.
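By way of illustration only, the sites noted as specific to HTPL-L all lie beyond the end of the shorter isoform (767 amino acids, against 954 for HTPL-L, as stated elsewhere in this description). The following minimal Python sketch filters a subset of the sites listed above by protein length; the dictionary layout, motif labels, and function name are hypothetical and are not part of the PROSITE tooling.

```python
# Illustrative sketch only. Residue ranges (1-based, inclusive) are copied
# from the list above; only a subset of motifs is shown.
HTPL_L_LENGTH = 954  # amino acids, per the description
HTPL_S_LENGTH = 767  # amino acids, per the description

# Hypothetical labels mapped to HTPL-L site coordinates.
SITES = {
    "PKC phosphorylation": [(194, 196), (759, 761), (800, 802)],
    "tyrosine kinase phosphorylation": [(329, 335), (681, 688), (887, 893)],
    "N-myristoylation": [(307, 312), (535, 540), (935, 940)],
}

def sites_within(sites, protein_length):
    """Keep only sites lying entirely within a protein of the given length."""
    return {
        motif: [(start, end) for start, end in ranges if end <= protein_length]
        for motif, ranges in sites.items()
    }

htpl_l_sites = sites_within(SITES, HTPL_L_LENGTH)
htpl_s_sites = sites_within(SITES, HTPL_S_LENGTH)
```

Under these assumptions, every site retained for HTPL-S ends at or before residue 767, and the three HTPL-L-only sites are dropped.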

FIG. 2 shows the genomic organization of HTPL.

At the top is shown the bacterial artificial
chromosome (BAC), with GenBank accession number (AC005875.2),
that spans the HTPL locus. The genome-derived single-exon probe
first used to demonstrate expression from this locus is shown
below the BAC and labeled "500". The 500 bp probe includes
sequence drawn solely from exon four.

As shown in FIG. 2, two HTPL isoforms have been
identified. Both isoforms contain four exons, with a few single
base pair differences between them (FIG. 3 and FIG. 4). The
longer form, HTPL-L, encodes a protein of 954 amino acids that
has a predicted molecular weight, prior to any
post-translational modification, of 107.6 kD. One of the single
base pair changes in HTPL-S introduces a premature stop codon at
position 2379 of the HTPL-S cDNA. HTPL-S, therefore, encodes a
shorter protein of 767 amino acids that has a predicted
molecular weight, prior to any post-translational modification,
of 86.9 kD. Both cDNA clones appear to be full length, with the
open reading frame starting with a methionine and terminating
with a stop codon.
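The protein lengths and predicted molecular weights given above can be cross-checked with simple arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the only inputs are the figures stated in the preceding paragraph.

```python
# Illustrative arithmetic only; inputs are the figures stated in the text.

def coding_nucleotides(num_residues, include_stop=True):
    """Nucleotides needed to encode a protein, optionally counting the stop codon."""
    codons = num_residues + (1 if include_stop else 0)
    return 3 * codons

def mean_residue_mass_da(molecular_weight_kd, num_residues):
    """Average mass per amino acid residue, in daltons."""
    return molecular_weight_kd * 1000.0 / num_residues

htpl_l_nt = coding_nucleotides(954)           # HTPL-L: 954 residues plus stop
htpl_s_nt = coding_nucleotides(767)           # HTPL-S: 767 residues plus stop
htpl_l_mean = mean_residue_mass_da(107.6, 954)
htpl_s_mean = mean_residue_mass_da(86.9, 767)
```

Both isoforms work out to roughly 113 Da per residue, which is in the range generally expected for an average polypeptide, so the stated lengths and molecular weights are mutually consistent.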

As further discussed in the examples herein,
expression of HTPL was assessed using hybridization to
genome-derived single exon microarrays. Microarray analysis of
the fourth exon showed low level expression in all tissues
tested, including adrenal, adult liver, bone marrow, brain,
fetal liver, kidney, lung, placenta and prostate. This was
confirmed by RT-PCR. RT-PCR also detected strong expression in
testis and weak expression in skeletal muscle and colon (see
Example 3).

As more fully described below, the present invention 
provides isolated nucleic acids that encode HTPL, including two 
isoforms, and fragments thereof. The invention further 
provides vectors for propagation and expression of the nucleic 
acids of the present invention, host cells comprising the 
nucleic acids and vectors of the present invention, proteins, 
protein fragments, and protein fusions of the present 
invention, and antibodies specific for all or any one of the 
isoforms. The invention provides pharmaceutical formulations 
of the nucleic acids, proteins, and antibodies of the present 
invention. The invention further provides transgenic cells and 
non-human organisms comprising human HTPL nucleic acids, and 
transgenic cells and non-human organisms with targeted 
disruption of the endogenous orthologue of the human HTPL. The 
invention additionally provides diagnostic, investigational, 
and therapeutic methods based on the HTPL nucleic acids, 
proteins, and antibodies of the present invention. 

DEFINITIONS 

Unless defined otherwise, all technical and
scientific terms used herein have the meaning commonly 
understood by one of ordinary skill in the art to which this 
invention belongs. 

As used herein, "nucleic acid" (synonymously, 
"polynucleotide") includes polynucleotides having natural 
nucleotides in native 5'-3' phosphodiester linkage — e.g., DNA 
or RNA — as well as polynucleotides that have nonnatural 
nucleotide analogues, nonnative internucleoside bonds, or both, 
so long as the nonnatural polynucleotide is capable of 
sequence-discriminating basepairing under experimentally 
desired conditions. Unless otherwise specified, the term 
"nucleic acid" includes any topological conformation; the term 
thus explicitly comprehends single-stranded, double-stranded,
partially duplexed, triplexed, hairpinned, circular, and 
padlocked conformations. 

As used herein, an "isolated nucleic acid" is a 
nucleic acid molecule that exists in a physical form that is 
nonidentical to any nucleic acid molecule of identical sequence 
as found in nature; "isolated" does not require, although it 
does not prohibit, that the nucleic acid so described has 
itself been physically removed from its native environment. 

For example, a nucleic acid can be said to be 
"isolated" when it includes nucleotides and/or internucleoside 
bonds not found in nature. When instead composed of natural 
nucleosides in phosphodiester linkage, a nucleic acid can be 
said to be "isolated" when it exists at a purity not found in 
nature, where purity can be adjudged with respect to the 
presence of nucleic acids of other sequence, with respect to 
the presence of proteins, with respect to the presence of 
lipids, or with respect to the presence of any other component
of a biological cell, or when the nucleic acid lacks sequence
that flanks an otherwise identical sequence in an organism's
genome, or when the nucleic acid possesses sequence not
identically present in nature.

As so defined, "isolated nucleic acid" includes 
nucleic acids integrated into a host cell chromosome at a 
heterologous site, recombinant fusions of a native fragment to 
a heterologous sequence, and recombinant vectors present as
episomes or as integrated into a host cell chromosome.

As used herein, an isolated nucleic acid "encodes" a 
reference polypeptide when at least a portion of the nucleic 
acid, or its complement, can be directly translated to provide 
the amino acid sequence of the reference polypeptide, or when
the isolated nucleic acid can be used, alone or as part of an 
expression vector, to express the reference polypeptide in 
vitro, in a prokaryotic host cell, or in a eukaryotic host 
cell. 

As used herein, the term "exon" refers to a nucleic 
acid sequence found in genomic DNA that is bioinformatically
predicted and/or experimentally confirmed to contribute 
contiguous sequence to a mature mRNA transcript. 

As used herein, the phrase "open reading frame" and 
the equivalent acronym "ORF" refer to that portion of a 
transcript-derived nucleic acid that can be translated in its
entirety into a sequence of contiguous amino acids. As so
defined, an ORF has length, measured in nucleotides, exactly
divisible by 3. As so defined, an ORF need not encode the
entirety of a natural protein. 
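By way of illustration only, the ORF definition above can be expressed as a short check. The following Python sketch is one possible reading, not a definitive implementation: it assumes that a sequence "translatable in its entirety" contains no stop codon in frame, and, consistent with the definition, it does not require an initiating methionine. The function name is hypothetical.

```python
# Illustrative sketch; assumes an ORF, as defined above, contains no
# in-frame stop codon and need not begin with ATG.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

def is_orf(seq):
    """Return True if seq can be translated in its entirety in frame 1."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # length must be exactly divisible by 3
    if set(seq) - set("ACGT"):
        return False  # reject ambiguous or non-nucleotide characters
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)
```

For example, `is_orf("GCTGGA")` is true even though the sequence lacks a start codon, whereas `is_orf("GCTGG")` fails the divisibility requirement.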

As used herein, the phrase "ORF-encoded peptide" 
refers to the predicted or actual translation of an ORF. 

As used herein, the phrase "degenerate variant" of a 
reference nucleic acid sequence intends all nucleic acid 
sequences that can be directly translated, using the standard 
genetic code, to provide an amino acid sequence identical to 
that translated from the reference nucleic acid sequence. 
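The definition of "degenerate variant" can be illustrated with a short comparison of translations under the standard genetic code. The Python sketch below is offered as an example under stated assumptions: both sequences are DNA, read in frame from their first nucleotide, with lengths divisible by 3; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (first base varies slowest, third base fastest); '*' marks a stop.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

def translate(seq):
    """Translate a DNA sequence in frame 1 using the standard genetic code."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be divisible by 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))

def is_degenerate_variant(candidate, reference):
    """True if both sequences translate to the identical amino acid sequence."""
    return translate(candidate) == translate(reference)
```

For instance, `ATGGCT` and `ATGGCC` both translate to Met-Ala and are therefore degenerate variants of one another, whereas `ATGGAT` (Met-Asp) is not a degenerate variant of either.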

As used herein, the term "microarray" and the
equivalent phrase "nucleic acid microarray" refer to a
substrate-bound collection of plural nucleic acids,
hybridization to each of the plurality of bound nucleic acids
being separately detectable. The substrate can be solid or
porous, planar or non-planar, unitary or distributed.

As so defined, the term "microarray" and phrase
"nucleic acid microarray" include all the devices so called in
Schena (ed.), DNA Microarrays: A Practical Approach (Practical
Approach Series), Oxford University Press (1999) (ISBN:
0199637768); Nature Genet. 21(1)(suppl):1 - 60 (1999); and
Schena (ed.), Microarray Biochip: Tools and Technology, Eaton
Publishing Company/BioTechniques Books Division (2000) (ISBN:
1881299376), the disclosures of which are incorporated herein
by reference in their entireties.

As so defined, the term "microarray" and phrase
"nucleic acid microarray" also include substrate-bound
collections of plural nucleic acids in which the plurality of
nucleic acids are distributably disposed on a plurality of
beads, rather than on a unitary planar substrate, as is
described, inter alia, in Brenner et al., Proc. Natl. Acad.
Sci. USA 97(4):1665-1670 (2000), the disclosure of which is
incorporated herein by reference in its entirety; in such case,
the term "microarray" and phrase "nucleic acid microarray"
refer to the plurality of beads in aggregate.

As used herein with respect to solution phase
hybridization, the term "probe", or equivalently, "nucleic acid
probe" or "hybridization probe", refers to an isolated nucleic
acid of known sequence that is, or is intended to be,
detectably labeled. As used herein with respect to a nucleic
acid microarray, the term "probe" (or equivalently "nucleic
acid probe" or "hybridization probe") refers to the isolated
nucleic acid that is, or is intended to be, bound to the
substrate. In either such context, the term "target" refers to
nucleic acid intended to be bound to probe by sequence
complementarity.

As used herein, the expression "probe comprising SEQ
ID NO:X", and variants thereof, intends a nucleic acid probe,
at least a portion of which probe has either (i) the sequence
directly as given in the referenced SEQ ID NO:X, or (ii) a
sequence complementary to the sequence as given in the
referenced SEQ ID NO:X, the choice as between sequence directly
as given and complement thereof dictated by the requirement
that the probe be complementary to the desired target.
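The choice between the sequence as given and its complement can be sketched as follows. This Python example is illustrative only: the function names and the `target_matches_listing` flag are hypothetical, and the sketch assumes unambiguous DNA letters written 5' to 3'.

```python
# Illustrative sketch; names and the boolean flag are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def probe_sequence(listed_sequence, target_matches_listing):
    """Pick the probe strand that is complementary to the desired target.

    If the target has the sequence as given in the listing, the probe must
    be the complement; if the target is itself the complement of the listed
    sequence, the probe uses the sequence directly as given.
    """
    if target_matches_listing:
        return reverse_complement(listed_sequence)
    return listed_sequence
```

For example, with a listed sequence of `ATGC`, a target carrying `ATGC` calls for the probe `GCAT`, while a target carrying the complementary strand calls for the probe `ATGC`.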

As used herein, the phrases "expression of a probe" 
and "expression of an isolated nucleic acid" and their 
linguistic equivalents intend that the probe (or, respectively,
the isolated nucleic acid), or a probe (or, respectively,
isolated nucleic acid) complementary in sequence thereto, can
hybridize detectably under high stringency conditions to a 
sample of nucleic acids that derive from mRNA transcripts from 
a given source. For example, and by way of illustration only, 
expression of a probe in "liver" means that the probe can 
hybridize detectably under high stringency conditions to a 
sample of nucleic acids that derive from mRNA obtained from 
liver. 

As used herein, "a single exon probe" comprises at 
least part of an exon ("reference exon") and can hybridize 
detectably under high stringency conditions to 
transcript-derived nucleic acids that include the reference
exon. The single exon probe will not, however, hybridize
detectably under high stringency conditions to nucleic acids
that lack the reference exon and that consist of one or more
exons that are found adjacent to the reference exon in the
genome.

For purposes herein, "high stringency conditions" are
defined for solution phase hybridization as aqueous
hybridization (i.e., free of formamide) in 6X SSC (where 20X
SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at
65°C for at least 8 hours, followed by one or more washes in
0.2X SSC, 0.1% SDS at 65°C. "Moderate stringency conditions"
are defined for solution phase hybridization as aqueous
hybridization (i.e., free of formamide) in 6X SSC, 1% SDS at
65°C for at least 8 hours, followed by one or more washes in 2X
SSC, 0.1% SDS at room temperature.
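Because the definitions above are stated in multiples of SSC, it can be helpful to convert them to salt molarities. The following Python sketch does so by simple proportion from the stated 20X composition; the function and variable names are hypothetical.

```python
# Composition stated in the text: 20X SSC contains 3.0 M NaCl and
# 0.3 M sodium citrate. Other strengths follow by linear dilution.
STOCK_FOLD = 20.0
STOCK_MOLARITY = {"NaCl": 3.0, "sodium citrate": 0.3}

def ssc_molarity(fold):
    """Return component molarities (mol/L) of SSC at the given X strength."""
    return {
        name: molar * fold / STOCK_FOLD
        for name, molar in STOCK_MOLARITY.items()
    }

hybridization = ssc_molarity(6.0)   # 6X SSC, used for hybridization
high_wash = ssc_molarity(0.2)       # 0.2X SSC, high stringency wash
moderate_wash = ssc_molarity(2.0)   # 2X SSC, moderate stringency wash
```

On this calculation the hybridization buffer is about 0.9 M NaCl, and the high stringency wash (about 0.03 M NaCl) is ten-fold lower in salt than the moderate stringency wash (about 0.3 M NaCl), in addition to being carried out at the higher temperature.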

For microarray-based hybridization, standard "high
stringency conditions" are defined as hybridization in 50%
formamide, 5X SSC, 0.2 µg/µl poly(dA), 0.2 µg/µl human Cot-1
DNA, and 0.5% SDS, in a humid oven at 42°C overnight, followed
by successive washes of the microarray in 1X SSC, 0.2% SDS at
55°C for 5 minutes, and then 0.1X SSC, 0.2% SDS, at 55°C for 20
minutes. For microarray-based hybridization, "moderate
stringency conditions", suitable for cross-hybridization to
mRNA encoding structurally- and functionally-related proteins,
are defined to be the same as those for high stringency
conditions but with reduction in temperature for hybridization
and washing to room temperature (approximately 25°C).

As used herein, the terms "protein", "polypeptide",
and "peptide" are used interchangeably to refer to a
naturally-occurring or synthetic polymer of amino acid monomers
(residues), irrespective of length, where amino acid monomer
here includes naturally-occurring amino acids,
naturally-occurring amino acid structural variants, and
synthetic non-naturally occurring analogs that are capable of
participating in peptide bonds. The terms "protein",
"polypeptide", and "peptide" explicitly permit
post-translational and post-synthetic modifications, such as
glycosylation.

The term "oligopeptide" herein denotes a protein,
polypeptide, or peptide having 25 or fewer monomeric subunits. 

The phrases "isolated protein", "isolated
polypeptide", "isolated peptide" and "isolated oligopeptide"
refer to a protein (or respectively to a polypeptide, peptide,
or oligopeptide) that is nonidentical to any protein molecule 
of identical amino acid sequence as found in nature; "isolated" 
does not require, although it does not prohibit, that the 
protein so described has itself been physically removed from 
its native environment. 

For example, a protein can be said to be "isolated" 
when it includes amino acid analogues or derivatives not found 
in nature, or includes linkages other than standard peptide
bonds.

When instead composed entirely of natural amino acids 
linked by peptide bonds, a protein can be said to be "isolated" 
when it exists at a purity not found in nature — where purity 
can be adjudged with respect to the presence of proteins of 
other sequence, with respect to the presence of non-protein 
compounds, such as nucleic acids, lipids, or other components 
of a biological cell, or when it exists in a composition not 
found in nature, such as in a host cell that does not naturally 
express that protein. 

A "purified protein" (equally, a purified 
polypeptide, peptide, or oligopeptide) is an isolated protein, 
as above described, present at a concentration of at least 95%, 
as measured on a weight basis with respect to total protein in 
a composition. A "substantially purified protein" (equally, a 
substantially purified polypeptide, peptide, or oligopeptide) 
is an isolated protein, as above described, present at a 
concentration of at least 70%, as measured on a weight basis 
with respect to total protein in a composition. 

As used herein, the phrase "protein isoforms" refers
to a plurality of proteins having nonidentical primary amino 
acid sequence but that share amino acid sequence encoded by at 
least one common exon. 

As used herein, the phrase "alternative splicing" and 
its linguistic equivalents includes all types of RNA processing 
that lead to expression of plural protein isoforms from a 
single gene; accordingly, the phrase "splice variant(s)" and 
its linguistic equivalents embraces mRNAs transcribed from a 
given gene that, however processed, collectively encode plural 
protein isoforms. For example, and by way of illustration 
only, splice variants can include exon insertions, exon 
extensions, exon truncations, exon deletions, alternatives in 
the 5' untranslated region ("5' UT") and alternatives in the 3'
untranslated region ("3' UT"). Such 3' alternatives include,
for example, differences in the site of RNA transcript cleavage
and site of poly(A) addition. See, e.g., Gautheret et al.,
Genome Res. 8:524-530 (1998). 

As used herein, "orthologues" are separate 
occurrences of the same gene in multiple species. The separate 
occurrences have similar, albeit nonidentical, amino acid 
sequences, the degree of sequence similarity depending, in 
part, upon the evolutionary distance of the species from a 
common ancestor having the same gene.

As used herein, the term "paralogues" indicates 
separate occurrences of a gene in one species. The separate
occurrences have similar, albeit nonidentical, amino acid 
sequences, the degree of sequence similarity depending, in 
part, upon the evolutionary distance from the gene duplication 
event giving rise to the separate occurrences. 

As used herein, the term "homologues" is generic to 
"orthologues" and "paralogues". 

As used herein, the term "antibody" refers to a
polypeptide, at least a portion of which is encoded by at least
one immunoglobulin gene, or fragment thereof, and that can bind
specifically to a desired target molecule. The term includes
naturally-occurring forms, as well as fragments and
derivatives.

Fragments within the scope of the term "antibody" 
include those produced by digestion with various proteases, 
those produced by chemical cleavage and/or chemical 
dissociation, and those produced recombinantly, so long as the 
fragment remains capable of specific binding to a target 
molecule. Among such fragments are Fab, Fab', Fv, F(ab')2, and
single chain Fv (scFv) fragments.

Derivatives within the scope of the term include 
antibodies (or fragments thereof) that have been modified in 
sequence, but remain capable of specific binding to a target 
molecule, including: interspecies chimeric and humanized 
antibodies; antibody fusions; heteromeric antibody complexes 
and antibody fusions, such as diabodies (bispecific
antibodies), single-chain diabodies, and intrabodies (see,
e.g., Marasco (ed.), Intracellular Antibodies: Research and
Disease Applications, Springer-Verlag New York, Inc. (1998)
(ISBN: 3540641513), the disclosure of which is incorporated
herein by reference in its entirety).

As used herein, antibodies can be produced by any
known technique, including harvest from cell culture of native
B lymphocytes, harvest from culture of hybridomas, recombinant
expression systems, and phage display.

As used herein, "antigen" refers to a ligand that can
be bound by an antibody; an antigen need not itself be
immunogenic. The portions of the antigen that make contact
with the antibody are denominated "epitopes".

"Specific binding" refers to the ability of two
molecular species concurrently present in a heterogeneous
(inhomogeneous) sample to bind to one another in preference to
binding to other molecular species in the sample. Typically, a
specific binding interaction will discriminate over
adventitious binding interactions in the reaction by at least
two-fold, more typically by at least 10-fold, often at least
100-fold; when used to detect analyte, specific binding is
sufficiently discriminatory when determinative of the presence
of the analyte in a heterogeneous (inhomogeneous) sample.
Typically, the affinity or avidity of a specific binding
reaction is at least about 10⁻⁷ M, with specific binding
reactions of greater specificity typically having affinity or
avidity of at least 10⁻⁸ M to at least about 10⁻⁹ M.

As used herein, "molecular binding partners" - and 
equivalently, "specific binding partners" - refer to pairs of 
molecules, typically pairs of biomolecules, that exhibit
specific binding. Nonlimiting examples are receptor and
ligand, antibody and antigen, and biotin to any of avidin,
streptavidin, NeutrAvidin and CaptAvidin.

The term "antisense", as used herein, refers to a 
nucleic acid molecule sufficiently complementary in sequence, 
and sufficiently long in that complementary sequence, as to 
hybridize under intracellular conditions to (i) a target mRNA 
transcript or (ii) the genomic DNA strand complementary to that 
transcribed to produce the target mRNA transcript. 

The term "portion", as used with respect to nucleic 
acids, proteins, and antibodies, is synonymous with "fragment". 

NUCLEIC ACID MOLECULES 

In a first aspect, the invention provides isolated
nucleic acids that encode HTPL, including two isoforms,
variants having at least 65% sequence identity thereto,
degenerate variants thereof, variants that encode HTPL proteins
having conservative or moderately conservative substitutions,
cross-hybridizing nucleic acids, and fragments thereof.

FIG. 3 and FIG. 4 present the nucleotide sequences of
the HTPL-L and HTPL-S cDNA clones, with predicted amino acid
translation; the sequences are further presented in the
Sequence Listing, incorporated herein by reference in its
entirety, in SEQ ID NOs: 1 (full length nucleotide sequence of
human HTPL-L cDNA), 3 (full length amino acid coding sequence
of human HTPL-L), 4 (full length nucleotide sequence of human
HTPL-S cDNA), and 6 (full length amino acid coding sequence of
human HTPL-S).

Unless otherwise indicated, each nucleotide sequence 
is set forth herein as a sequence of deoxyribonucleotides. It
is intended, however, that the given sequence be interpreted as 
would be appropriate to the polynucleotide composition: for 
example, if the isolated nucleic acid is composed of RNA, the 
given sequence intends ribonucleotides, with uridine 
substituted for thymidine. 

Unless otherwise indicated, nucleotide sequences of 
the isolated nucleic acids of the present invention were 
determined by sequencing a DNA molecule that had resulted, 
directly or indirectly, from at least one enzymatic 
polymerization reaction (e.g., reverse transcription and/or 
polymerase chain reaction) using an automated sequencer (such 
as the MegaBACE™ 1000, Amersham Biosciences, Sunnyvale, CA, 
USA), or by reliance upon such sequence or upon genomic
sequence prior-accessioned into a public database. Unless 
otherwise indicated, all amino acid sequences of the 
polypeptides of the present invention were predicted by 
translation from the nucleic acid sequences so determined.

As a consequence, any nucleic acid sequence presented
herein may contain errors introduced by erroneous incorporation
of nucleotides during polymerization, by erroneous base calling
by the automated sequencer (although such sequencing errors
have been minimized for the nucleic acids directly determined
herein, unless otherwise indicated, by the sequencing of each
of the complementary strands of a duplex DNA), or by similar
errors accessioned into the public database. Such errors can
readily be identified and corrected by resequencing of the
genomic locus using standard techniques.

Single nucleotide polymorphisms (SNPs) occur
frequently in eukaryotic genomes - more than 1.4 million SNPs
have already been identified in the human genome, International
Human Genome Sequencing Consortium, Nature 409:860-921 (2001)
- and the sequence determined from one individual of a species
may differ from other allelic forms present within the
population. Additionally, small deletions and insertions,
rather than single nucleotide polymorphisms, are not uncommon
in the general population, and often do not alter the function
of the protein.

Accordingly, it is an aspect of the present invention
to provide nucleic acids not only identical in sequence to
those described with particularity herein, but also to provide
isolated nucleic acids at least about 65% identical in sequence
to those described with particularity herein, typically at
least about 70%, 75%, 80%, 85%, or 90% identical in sequence to
those described with particularity herein, usefully at least
about 91%, 92%, 93%, 94%, or 95% identical in sequence to those
described with particularity herein, usefully at least about
96%, 97%, 98%, or 99% identical in sequence to those described
with particularity herein, and, most conservatively, at least
about 99.5%, 99.6%, 99.7%, 99.8% and 99.9% identical in
sequence to those described with particularity herein. These
sequence variants can be naturally occurring or can result from
human intervention, as by random or directed mutagenesis.

For purposes herein, percent identity of two nucleic
acid sequences is determined using the procedure of Tatusova et
al., "Blast 2 sequences - a new tool for comparing protein and
nucleotide sequences", FEMS Microbiol Lett. 174:247-250 (1999),
which procedure is effectuated by the computer program BLAST 2
SEQUENCES, available online at

http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html.

To assess percent identity of nucleic acids, the BLASTN module
of BLAST 2 SEQUENCES is used with default values of (i) reward
for a match: 1; (ii) penalty for a mismatch: -2; (iii) open gap
5 and extension gap 2 penalties; (iv) gap X_dropoff 50, expect
10, word size 11, filter, and both sequences are entered in their
entireties.
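
As a rough illustration of how the scoring values recited above bear
on an alignment, the following sketch computes percent identity and a
raw score for two sequences that have already been aligned. It is not
the BLAST 2 SEQUENCES program and does not perform the alignment
itself. The function names and the example strings are hypothetical,
gaps are assumed to be written as "-", percent identity is assumed to
be matches divided by alignment columns, and a gap of length k is
assumed to cost 5 + 2k.

```python
MATCH_REWARD = 1       # (i) reward for a match
MISMATCH_PENALTY = -2  # (ii) penalty for a mismatch
GAP_OPEN = 5           # (iii) open gap penalty
GAP_EXTEND = 2         # (iii) extension gap penalty


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of all alignment columns."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def raw_score(aligned_a: str, aligned_b: str) -> int:
    """Score an existing alignment with the values listed above."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be equal in length")
    score = 0
    gap_side = None  # which sequence, if any, carried a gap in the previous column
    for x, y in zip(aligned_a, aligned_b):
        side = "a" if x == "-" else "b" if y == "-" else None
        if side is None:
            score += MATCH_REWARD if x == y else MISMATCH_PENALTY
        else:
            if side != gap_side:
                score -= GAP_OPEN  # charged once when a new gap starts
            score -= GAP_EXTEND    # charged for every gapped column
        gap_side = side
    return score


if __name__ == "__main__":
    a, b = "ACGTACGT-A", "ACGAACGTTA"
    print(percent_identity(a, b), raw_score(a, b))
```

In the example, eight of ten columns match, one is a mismatch, and
one is a single-column gap.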



As is well known, the genetic code is degenerate,
with each amino acid except methionine and tryptophan
translated from a plurality of codons, thus permitting a
plurality of nucleic acids of disparate sequence to encode the
identical protein. As is also well known, codon choice for
optimal expression varies from species to species. The isolated
nucleic acids of the present invention being useful for
expression of HTPL proteins and protein fragments, it is,
therefore, another aspect of the present invention to provide
isolated nucleic acids that encode HTPL proteins and portions
thereof not only identical in sequence to those described with
particularity herein, but degenerate variants thereof as well.
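
The extent of this degeneracy can be illustrated by counting the
distinct coding sequences that specify a given peptide. The sketch
below assumes the standard genetic code, in which methionine and
tryptophan each have a single codon; the function name and the example
peptide are hypothetical and are not taken from the HTPL sequences.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; 61 sense codons in total).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_degenerate_encodings(peptide: str) -> int:
    """Count the distinct coding sequences that encode the peptide."""
    # Each position may use any synonymous codon, so the counts multiply.
    return prod(CODONS_PER_AMINO_ACID[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # Hypothetical tripeptide Met-Lys-Leu: 1 x 2 x 6 encodings.
    print(count_degenerate_encodings("MKL"))
```

Because the counts multiply, even short peptides can be encoded by
many nucleic acids of disparate sequence.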

As is also well known, amino acid substitutions occur 
frequently among natural allelic variants, with conservative 
substitutions often occasioning only de minimis change in
protein function. 

Accordingly, it is an aspect of the present invention
to provide nucleic acids not only identical in sequence to
those described with particularity herein, but also to provide
isolated nucleic acids that encode HTPL, and portions thereof,
having conservative amino acid substitutions, and also to
provide isolated nucleic acids that encode HTPL, and portions
thereof, having moderately conservative amino acid
substitutions.

Although there are a variety of metrics for calling 
conservative amino acid substitutions, based primarily on 
either observed changes among evolutionarily related proteins 
or on predicted chemical similarity, for purposes herein a 
conservative replacement is any change having a positive value 
in the PAM250 log-likelihood matrix reproduced herein below
(see Gonnet et al., Science 256(5062):1443-5 (1992)):

For purposes herein, a "moderately conservative" replacement is
any change having a nonnegative value in the PAM250
log-likelihood matrix reproduced herein above.
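
The two definitions reduce to simple threshold tests on a matrix
entry, as the following sketch shows. The function names are
hypothetical, and the scores used in the example are placeholders
supplied by the caller rather than values taken from the matrix.

```python
def is_conservative(score: float) -> bool:
    """A conservative replacement has a positive matrix value."""
    return score > 0


def is_moderately_conservative(score: float) -> bool:
    """A moderately conservative replacement has a nonnegative matrix value."""
    return score >= 0


if __name__ == "__main__":
    # Placeholder scores, not entries from the matrix.
    for score in (1.5, 0.0, -2.0):
        print(score, is_conservative(score), is_moderately_conservative(score))
```

As defined, every conservative replacement is also moderately
conservative; a zero entry is the only case that is moderately
conservative without being conservative.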

As is also well known in the art, relatedness of
nucleic acids can also be characterized using a functional
test, the ability of the two nucleic acids to base-pair to one
another at defined hybridization stringencies.

It is, therefore, another aspect of the invention to
provide isolated nucleic acids not only identical in sequence
to those described with particularity herein, but also to
provide isolated nucleic acids ("cross-hybridizing nucleic
acids") that hybridize under high stringency conditions (as
defined herein below) to all or to a portion of various of the
isolated HTPL nucleic acids of the present invention
("reference nucleic acids"), as well as cross-hybridizing
nucleic acids that hybridize under moderate stringency
conditions to all or to a portion of various of the isolated
HTPL nucleic acids of the present invention.

Such cross-hybridizing nucleic acids are useful,
inter alia, as probes for, and to drive expression of, proteins
related to the proteins of the present invention as alternative
isoforms, homologues, paralogues, and orthologues.
Particularly useful orthologues are those from other primate
species, such as chimpanzee, rhesus macaque, monkey, baboon,
orangutan, and gorilla; from rodents, such as rats, mice, and
guinea pigs; from lagomorphs, such as rabbits; and from
domestic livestock, such as cow, pig, sheep, horse, goat and
chicken.

For purposes herein, high stringency conditions are
defined as aqueous hybridization (i.e., free of formamide) in
6X SSC (where 20X SSC contains 3.0 M NaCl and 0.3 M sodium
citrate), 1% SDS at 65°C for at least 8 hours, followed by one
or more washes in 0.2X SSC, 0.1% SDS at 65°C. For purposes
herein, moderate stringency conditions are defined as aqueous
hybridization (i.e., free of formamide) in 6X SSC, 1% SDS at
65°C for at least 8 hours, followed by one or more washes in 2X
SSC, 0.1% SDS at room temperature.
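
The salt concentrations implied by these definitions follow by linear
scaling from the stated 20X SSC composition. The sketch below works
through that arithmetic; the function name is hypothetical, and the
only inputs are the 20X values given above.

```python
# Composition of 20X SSC as stated in the text (molar concentrations).
STOCK_FOLD = 20.0
STOCK_NACL_M = 3.0
STOCK_CITRATE_M = 0.3


def ssc_molarity(fold: float) -> tuple:
    """Return (NaCl, sodium citrate) molarity for an SSC solution of given fold."""
    scale = fold / STOCK_FOLD
    return STOCK_NACL_M * scale, STOCK_CITRATE_M * scale


if __name__ == "__main__":
    for fold in (6.0, 2.0, 0.2):
        nacl, citrate = ssc_molarity(fold)
        print(f"{fold}X SSC: {nacl:.3f} M NaCl, {citrate:.4f} M sodium citrate")
```

On this arithmetic, hybridization in 6X SSC is carried out at about
0.9 M NaCl, the moderate stringency wash (2X SSC) at about 0.3 M NaCl,
and the high stringency wash (0.2X SSC) at about 0.03 M NaCl.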

The hybridizing portion of the reference nucleic acid
is typically at least 15 nucleotides in length, often at least
17 nucleotides in length. Often, however, the hybridizing
portion of the reference nucleic acid is at least 20
nucleotides in length, 25 nucleotides in length, and even 30
nucleotides, 35 nucleotides, 40 nucleotides, and 50 nucleotides
in length. Of course, cross-hybridizing nucleic acids that
hybridize to a larger portion of the reference nucleic acid -
for example, to a portion of at least 50 nt, at least 100 nt,
at least 150 nt, 200 nt, 250 nt, 300 nt, 350 nt, 400 nt, 450
nt, or 500 nt or more - or even to the entire length of the
reference nucleic acid, are also useful.

The hybridizing portion of the cross-hybridizing
nucleic acid is at least 75% identical in sequence to at least
a portion of the reference nucleic acid. Typically, the
hybridizing portion of the cross-hybridizing nucleic acid is at
least 80%, often at least 85%, 86%, 87%, 88%, 89% or even at
least 90% identical in sequence to at least a portion of the
reference nucleic acid. Often, the hybridizing portion of the
cross-hybridizing nucleic acid will be at least 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98%, or 99% identical in sequence to at
least a portion of the reference nucleic acid sequence. At
times, the hybridizing portion of the cross-hybridizing nucleic
acid will be at least 99.5% identical in sequence to at least a
portion of the reference nucleic acid.

The invention also provides fragments of various of
the isolated nucleic acids of the present invention.

By "fragments" of a reference nucleic acid is here
intended isolated nucleic acids, however obtained, that have a
nucleotide sequence identical to a portion of the reference
nucleic acid sequence, which portion is at least 17 nucleotides
and less than the entirety of the reference nucleic acid. As
so defined, "fragments" need not be obtained by physical
fragmentation of the reference nucleic acid, although such
provenance is not thereby precluded.

In theory, an oligonucleotide of 17 nucleotides is of
sufficient length as to occur at random less frequently than
once in the three gigabase human genome, and thus to provide a
nucleic acid probe that can uniquely identify the reference
sequence in a nucleic acid mixture of genomic complexity. As
is well known, further specificity can be obtained by probing
nucleic acid samples of subgenomic complexity, and/or by using
plural fragments as short as 17 nucleotides in length
collectively to prime amplification of nucleic acids, as, e.g.,
by polymerase chain reaction (PCR).
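
The arithmetic behind the 17 nucleotide figure can be sketched under
a deliberately simple model, which is an assumption made here for
illustration only: the four bases are equally frequent and independent
at every position, and the number of possible start positions is
approximated by the genome size. The function names are hypothetical.

```python
def expected_random_hits(length: int, genome_size: int, strands: int = 1) -> float:
    """Expected number of chance occurrences of one specific oligonucleotide."""
    # A specific sequence of the given length has probability (1/4)**length
    # at each position under the equal-frequency, independent-base model.
    return strands * genome_size / 4 ** length


def min_unique_length(genome_size: int, strands: int = 2) -> int:
    """Shortest length whose expected chance occurrences fall below one."""
    length = 1
    while expected_random_hits(length, genome_size, strands) >= 1:
        length += 1
    return length


if __name__ == "__main__":
    genome = 3_000_000_000  # three gigabases, as stated in the text
    print(4 ** 17)                            # number of distinct 17-mers
    print(expected_random_hits(17, genome))   # expected chance hits, one strand
    print(min_unique_length(genome))          # counting both strands
```

There are 4^17 (about 1.7 x 10^10) distinct 17-mers, so a given 17-mer
is expected to occur by chance fewer than once in three gigabases.
Under this model, 17 is also the shortest length for which that remains
true when both strands are counted. Real genomes contain repeats and
skewed base composition, so the model gives only a rough bound.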

As further described herein below, nucleic acid
fragments that encode at least 6 contiguous amino acids (i.e.,
fragments of 18 nucleotides or more) are useful in directing
the expression or the synthesis of peptides that have utility
in mapping the epitopes of the protein encoded by the reference
nucleic acid. See, e.g., Geysen et al., "Use of peptide
synthesis to probe viral antigens for epitopes to a resolution
of a single amino acid," Proc. Natl. Acad. Sci. USA
81:3998-4002 (1984); and U.S. Pat. Nos. 4,708,871 and 5,595,915, the
disclosures of which are incorporated herein by reference in
their entireties.

As further described herein below, fragments that
encode at least 8 contiguous amino acids (i.e., fragments of 24
nucleotides or more) are useful in directing the expression or
the synthesis of peptides that have utility as immunogens.
See, e.g., Lerner, "Tapping the immunological repertoire to
produce antibodies of predetermined specificity," Nature
299:592-596 (1982); Shinnick et al., "Synthetic peptide
immunogens as vaccines," Annu. Rev. Microbiol. 37:425-46
(1983); Sutcliffe et al., "Antibodies that react with
predetermined sites on proteins," Science 219:660-6 (1983), the
disclosures of which are incorporated herein by reference in
their entireties.

The nucleic acid fragment of the present invention is
thus at least 17 nucleotides in length, typically at least 18
nucleotides in length, and often at least 24 nucleotides in
length. Often, the nucleic acid of the present invention is at
least 25 nucleotides in length, and even 30 nucleotides, 35
nucleotides, 40 nucleotides, or 45 nucleotides in length. Of
course, larger fragments having at least 50 nt, at least 100
nt, at least 150 nt, 200 nt, 250 nt, 300 nt, 350 nt, 400 nt,
450 nt, or 500 nt or more are also useful, and at times
preferred.
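
The length thresholds recited above follow from three nucleotides per
codon and can be summarized in a short sketch. The function names and
the use labels are hypothetical shorthand for the utilities described
in the text, and the sketch assumes the fragment is read in frame from
its first nucleotide.

```python
from typing import List

NUCLEOTIDES_PER_CODON = 3
MIN_PROBE_NT = 17          # minimum fragment length stated in the text
EPITOPE_MAPPING_AA = 6     # contiguous amino acids for epitope mapping
IMMUNOGEN_AA = 8           # contiguous amino acids for immunogens


def min_nt_for_peptide(amino_acids: int) -> int:
    """Nucleotides needed to encode the given number of amino acids."""
    return amino_acids * NUCLEOTIDES_PER_CODON


def fragment_uses(length_nt: int) -> List[str]:
    """List the utilities whose minimum length the fragment satisfies."""
    uses = []
    if length_nt >= MIN_PROBE_NT:
        uses.append("probe or primer")
    if length_nt >= min_nt_for_peptide(EPITOPE_MAPPING_AA):
        uses.append("epitope mapping peptide")
    if length_nt >= min_nt_for_peptide(IMMUNOGEN_AA):
        uses.append("immunogenic peptide")
    return uses


if __name__ == "__main__":
    for length in (16, 17, 18, 24):
        print(length, fragment_uses(length))
```

Six and eight amino acids correspond to 18 and 24 nucleotides
respectively, which is why those lengths recur in the preceding
paragraphs.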

Having been based upon the mining of genomic
sequence, rather than upon surveillance of expressed message,
the present invention further provides isolated genome-derived
nucleic acids that include portions of the HTPL gene.

The invention particularly provides genome-derived
single exon probes. 

As further described in commonly owned and copending
U.S. patent application serial nos. 09/864,761, filed May 23,
2001; 09/774,203, filed January 29, 2001; and 09/632,366, filed
August 3, 2000, the disclosures of which are incorporated
herein by reference in their entireties, "a single exon probe"
comprises at least part of an exon ("reference exon") and can
hybridize detectably under high stringency conditions to
transcript-derived nucleic acids that include the reference
exon. The single exon probe will not, however, hybridize
detectably under high stringency conditions to nucleic acids
that lack the reference exon and instead consist of one or more
exons that are found adjacent to the reference exon in the
genome.

Genome-derived single exon probes typically further
comprise, contiguous to a first end of the exon portion, a
first intronic and/or intergenic sequence that is identically
contiguous to the exon in the genome. Often, the
genome-derived single exon probe further comprises, contiguous to a
second end of the exonic portion, a second intronic and/or
intergenic sequence that is identically contiguous to the exon
in the genome.

The minimum length of genome-derived single exon
probes is defined by the requirement that the exonic portion be 
of sufficient length to hybridize under high stringency 
conditions to transcript-derived nucleic acids. Accordingly, 
the exon portion is at least 17 nucleotides, typically at least 
18 nucleotides, 20 nucleotides, 24 nucleotides, 25 nucleotides 
or even 30, 35, 40, 45, or 50 nucleotides in length, and can 
usefully include the entirety of the exon, up to 100 nt, 150 
nt, 200 nt, 250 nt, 300 nt, 350 nt, 400 nt or even 500 nt or
more in length. 

The maximum length of genome-derived single exon
probes is defined by the requirement that the probes contain
portions of no more than one exon, that is, be unable to
hybridize detectably under high stringency conditions to
nucleic acids that lack the reference exon but include one or
more exons that are found adjacent to the reference exon in the
genome.

Given variable spacing of exons through eukaryotic 
genomes, the maximum length of single exon probes of the 
present invention is typically no more than 25 kb, often no 
more than 20 kb, 15 kb, 10 kb or 7.5 kb, or even no more than 5 
kb, 4 kb, 3 kb, or even no more than about 2.5 kb in length.

The genome-derived single exon probes of the present
invention can usefully include at least a first terminal
priming sequence not found in contiguity with the rest of the
probe sequence in the genome, and often will contain a second
terminal priming sequence not found in contiguity with the rest
of the probe sequence in the genome.

The present invention also provides isolated
genome-derived nucleic acids that include nucleic acid sequence
elements that control transcription of the HTPL gene.

With a complete draft of the human genome now
available, genomic sequences that are within the vicinity of
the HTPL coding region (and that are additional to those
described with particularity herein) can readily be obtained by
PCR amplification.

The isolated nucleic acids of the present invention
can be composed of natural nucleotides in native 5'-3'
phosphodiester internucleoside linkage - e.g., DNA or RNA - or
can contain any or all of nonnatural nucleotide analogues,
nonnative internucleoside bonds, or post-synthesis
modifications, either throughout the length of the nucleic acid
or localized to one or more portions thereof.

As is well known in the art, when the isolated
nucleic acid is used as a hybridization probe, the range of
such nonnatural analogues, nonnative internucleoside bonds, or
post-synthesis modifications will be limited to those that
permit sequence-discriminating basepairing of the resulting
nucleic acid. When used to direct expression of RNA or protein
in vitro or in vivo, the range of such nonnatural analogues,
nonnative internucleoside bonds, or post-synthesis
modifications will be limited to those that permit the nucleic
acid to function properly as a polymerization substrate. When
the isolated nucleic acid is used as a therapeutic agent, the
range of such changes will be limited to those that do not
confer toxicity upon the isolated nucleic acid.

For example, when desired to be used as probes, the 
isolated nucleic acids of the present invention can usefully 
include nucleotide analogues that incorporate labels that are 
directly detectable, such as radiolabels or fluorophores, or
nucleotide analogues that incorporate labels that can be
visualized in a subsequent reaction, such as biotin or various
haptens.

Common radiolabeled analogues include those labeled
with ³³P, ³²P, and ³⁵S, such as α-³²P-dATP, α-³²P-dCTP, α-³²P-dGTP,
α-³²P-dTTP, α-³²P-3'-dATP, α-³²P-ATP, α-³²P-CTP, α-³²P-GTP,
α-³²P-UTP, α-³⁵S-dATP, γ-³⁵S-GTP, γ-³³P-dATP, and the like.

Commercially available fluorescent nucleotide
analogues readily incorporated into the nucleic acids of the
present invention include Cy3-dCTP, Cy3-dUTP, Cy5-dCTP,
Cy5-dUTP (Amersham Pharmacia Biotech, Piscataway, New Jersey, USA),
fluorescein-12-dUTP, tetramethylrhodamine-6-dUTP, Texas
Red®-5-dUTP, Cascade Blue®-7-dUTP, BODIPY® FL-14-dUTP, BODIPY®
TMR-14-dUTP, BODIPY® TR-14-dUTP, Rhodamine Green™-5-dUTP,
Oregon Green® 488-5-dUTP, Texas Red®-12-dUTP, BODIPY®
630/650-14-dUTP, BODIPY® 650/665-14-dUTP, Alexa Fluor®
488-5-dUTP, Alexa Fluor® 532-5-dUTP, Alexa Fluor® 568-5-dUTP,
Alexa Fluor® 594-5-dUTP, Alexa Fluor® 546-14-dUTP,
fluorescein-12-UTP, tetramethylrhodamine-6-UTP, Texas
Red®-5-UTP, Cascade Blue®-7-UTP, BODIPY® FL-14-UTP, BODIPY®
TMR-14-UTP, BODIPY® TR-14-UTP, Rhodamine Green™-5-UTP, Alexa
Fluor® 488-5-UTP, Alexa Fluor® 546-14-UTP (Molecular Probes,
Inc., Eugene, OR, USA).

Protocols are available for custom synthesis of
nucleotides having other fluorophores. Henegariu et al.,
"Custom Fluorescent-Nucleotide Synthesis as an Alternative
Method for Nucleic Acid Labeling," Nature Biotechnol.
18:345-348 (2000), the disclosure of which is incorporated herein by
reference in its entirety.

Haptens that are commonly conjugated to nucleotides
for subsequent labeling include biotin (biotin-11-dUTP,
Molecular Probes, Inc., Eugene, OR, USA; biotin-21-UTP,
biotin-21-dUTP, Clontech Laboratories, Inc., Palo Alto, CA, USA),
digoxigenin (DIG-11-dUTP, alkali labile, DIG-11-UTP, Roche
Diagnostics Corp., Indianapolis, IN, USA), and dinitrophenyl
(dinitrophenyl-11-dUTP, Molecular Probes, Inc., Eugene, OR,
USA).

As another example, when desired to be used for
antisense inhibition of transcription or translation, the
isolated nucleic acids of the present invention can usefully
include altered, often nuclease-resistant, internucleoside
bonds. See Hartmann et al. (eds.), Manual of Antisense
Methodology (Perspectives in Antisense Science), Kluwer Law
International (1999) (ISBN: 079238539X); Stein et al. (eds.),
Applied Antisense Oligonucleotide Technology, Wiley-Liss
(1998) (ISBN: 0471172790); Chadwick et al. (eds.),
Oligonucleotides as Therapeutic Agents - Symposium No. 209,
John Wiley & Son Ltd (1997) (ISBN: 0471972797), the disclosures
of which are incorporated herein by reference in their
entireties. Such altered internucleoside bonds are often
desired also when the isolated nucleic acid of the present
invention is to be used for targeted gene correction, Gamper et
al., Nucl. Acids Res. 28(21):4332-4339 (2000), the disclosure
of which is incorporated herein by reference in its entirety.

Modified oligonucleotide backbones often preferred
when the nucleic acid is to be used for antisense purposes are,
for example, phosphorothioates, chiral phosphorothioates,
phosphorodithioates, phosphotriesters,
aminoalkylphosphotriesters, methyl and other alkyl phosphonates
including 3'-alkylene phosphonates and chiral phosphonates,
phosphinates, phosphoramidates including 3'-amino
phosphoramidate and aminoalkylphosphoramidates,
thionophosphoramidates, thionoalkylphosphonates,
thionoalkylphosphotriesters, and boranophosphates having normal
3'-5' linkages, 2'-5' linked analogs of these, and those having
inverted polarity wherein the adjacent pairs of nucleoside
units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'.
Representative U.S. patents that teach the preparation of the
above phosphorus-containing linkages include, but are not
limited to, U.S. Pat. Nos. 3,687,808; 4,469,863; 4,476,301;
5,023,243; 5,177,196; 5,188,897; 5,264,423; 5,276,019;
5,278,302; 5,286,717; 5,321,131; 5,399,676; 5,405,939;
5,453,496; 5,455,233; 5,466,677; 5,476,925; 5,519,126;
5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799;
5,587,361; and 5,625,050, the disclosures of which are
incorporated herein by reference in their entireties.

Preferred modified oligonucleotide backbones for
antisense use that do not include a phosphorus atom have
backbones that are formed by short chain alkyl or cycloalkyl
internucleoside linkages, mixed heteroatom and alkyl or
cycloalkyl internucleoside linkages, or one or more short chain
heteroatomic or heterocyclic internucleoside linkages. These
include those having morpholino linkages (formed in part from
the sugar portion of a nucleoside); siloxane backbones;
sulfide, sulfoxide and sulfone backbones; formacetyl and
thioformacetyl backbones; methylene formacetyl and
thioformacetyl backbones; alkene containing backbones;
sulfamate backbones; methyleneimino and methylenehydrazino
backbones; sulfonate and sulfonamide backbones; amide
backbones; and others having mixed N, O, S and CH2 component
parts. Representative U.S. patents that teach the preparation
of the above backbones include, but are not limited to, U.S.
Pat. Nos. 5,034,506; 5,166,315; 5,185,444; 5,214,134;
5,216,141; 5,235,033; 5,264,562; 5,264,564; 5,405,938;
5,434,257; 5,466,677; 5,470,967; 5,489,677; 5,541,307;
5,561,225; 5,596,086; 5,602,240; 5,610,289;
5,608,046; 5,618,704; 5,623,070; 5,663,312;
5,633,360; 5,677,437; and 5,677,439, the disclosures of which
are incorporated herein by reference in their entireties.

In other preferred oligonucleotide mimetics, both the 
sugar and the internucleoside linkage are replaced with novel 
groups, such as peptide nucleic acids (PNA).

In PNA compounds, the phosphodiester backbone of the 
nucleic acid is replaced with an amide-containing backbone, in
particular by repeating N-(2-aminoethyl)glycine units linked
by amide bonds. Nucleobases are bound directly or indirectly 
to aza nitrogen atoms of the amide portion of the backbone, 
typically by methylene carbonyl linkages. 

The uncharged nature of the PNA backbone provides 
PNA/DNA and PNA/RNA duplexes with a higher thermal stability 
than is found in DNA/DNA and DNA/RNA duplexes, resulting from 
the lack of charge repulsion between the PNA and DNA or RNA 
strand. In general, the Tm of a PNA/DNA or PNA/RNA duplex is 
1°C higher per base pair than the Tm of the corresponding 
DNA/DNA or DNA/RNA duplex (in 100 mM NaCl).
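
The rule of thumb stated above can be written as a one-line
estimate. The function name and the example Tm are hypothetical; the
sketch simply applies the stated increment of 1°C per base pair and
is not a thermodynamic model.

```python
PNA_TM_GAIN_PER_BP_C = 1.0  # stated increment per base pair, in 100 mM NaCl


def estimate_pna_duplex_tm(dna_duplex_tm_c: float, base_pairs: int) -> float:
    """Estimate the Tm of a PNA/DNA or PNA/RNA duplex from the DNA duplex Tm."""
    if base_pairs < 0:
        raise ValueError("base_pairs must be nonnegative")
    return dna_duplex_tm_c + PNA_TM_GAIN_PER_BP_C * base_pairs


if __name__ == "__main__":
    # Hypothetical 15-mer whose DNA/DNA duplex melts at 50 degrees C.
    print(estimate_pna_duplex_tm(50.0, 15))
```

For the hypothetical 15-mer, the estimate is a Tm about 15°C above
that of the corresponding DNA/DNA duplex.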

The neutral backbone also allows PNA to form stable 
DNA duplexes largely independent of salt concentration. At low 
ionic strength, PNA can be hybridized to a target sequence at 
temperatures that make DNA hybridization problematic or 
impossible. And unlike DNA/DNA duplex formation, PNA 
hybridization is possible in the absence of magnesium. 
Adjusting the ionic strength, therefore, is useful if competing 
DNA or RNA is present in the sample, or if the nucleic acid 
being probed contains a high level of secondary structure.

PNA also demonstrates greater specificity in binding
to complementary DNA. A PNA/DNA mismatch is more destabilizing
than a DNA/DNA mismatch. A single mismatch in a mixed PNA/DNA
15-mer lowers the Tm by 8-20°C (15°C on average). In the
corresponding DNA/DNA duplexes, a single mismatch lowers the Tm
by 4-16°C (11°C on average). Because PNA probes can be
significantly shorter than DNA probes, their specificity is
greater.

Additionally, nucleases and proteases do not 
recognize the PNA polyamide backbone with nucleobase 
sidechains. As a result, PNA oligomers are resistant to 
degradation by enzymes, and the lifetime of these compounds is 
extended both in vivo and in vitro. In addition, PNA is stable 
over a wide pH range. 

Because its backbone is formed from amide bonds, PNA 
can be synthesized using a modified peptide synthesis protocol. 

PNA oligomers can be synthesized by both Fmoc and tBoc 
methods. Representative U.S. patents that teach the 
preparation of PNA compounds include, but are not limited to, 
U.S. Pat. Nos. 5,539,082; 5,714,331; and 5,719,262, each of 
which is herein incorporated by reference; automated PNA 
synthesis is readily achievable on commercial synthesizers 
(see, e.g., "PNA User's Guide," Rev. 2, February 1998, 
Perseptive Biosystems Part No. 60138, Applied Biosystems, Inc., 
Foster City, CA).

PNA chemistry and applications are reviewed, inter
alia, in Ray et al., FASEB J. 14(9):1041-60 (2000); Nielsen et
al., Pharmacol Toxicol. 86(1):3-7 (2000); Larsen et al.,
Biochim Biophys Acta. 1489(1):159-66 (1999); Nielsen, Curr.
Opin. Struct. Biol. 9(3):353-7 (1999), and Nielsen, Curr. Opin.
Biotechnol. 10(1):71-5 (1999), the disclosures of which are
incorporated herein by reference in their entireties.

Differences from nucleic acid compositions found in
nature - e.g., nonnative bases, altered internucleoside
linkages, post-synthesis modification - can be present
throughout the length of the nucleic acid or can, instead,
usefully be localized to discrete portions thereof. As an
example of the latter, chimeric nucleic acids can be
synthesized that have discrete DNA and RNA domains and
demonstrated utility for targeted gene repair, as further
described in U.S. Pat. Nos. 5,760,012 and 5,731,181, the
disclosures of which are incorporated herein by reference in
their entireties. As another example, chimeric nucleic acids
comprising both DNA and PNA have been demonstrated to have
utility in modified PCR reactions. See Misra et al., Biochem.
37:1917-1925 (1998); see also Finn et al., Nucl. Acids Res.
24:3357-3363 (1996), incorporated herein by reference.

Unless otherwise specified, nucleic acids of the
present invention can include any topological conformation
appropriate to the desired use; the term thus explicitly
comprehends, among others, single-stranded, double-stranded,
triplexed, quadruplexed, partially double-stranded,
partially-triplexed, partially-quadruplexed, branched, hairpinned,
circular, and padlocked conformations. Padlock conformations
and their utilities are further described in Baner et al.,
Curr. Opin. Biotechnol. 12:11-15 (2001); Escude et al., Proc.
Natl. Acad. Sci. USA 96(19):10603-7 (1999); Nilsson et al.,
Science 265(5181):2085-8 (1994), the disclosures of which are
incorporated herein by reference in their entireties. Triplex
and quadruplex conformations, and their utilities, are reviewed
in Praseuth et al., Biochim. Biophys. Acta. 1489(1):181-206
(1999); Fox, Curr. Med. Chem. 7(1):17-37 (2000); Kochetkova et
al., Methods Mol. Biol. 130:189-201 (2000); Chan et al., J.
Mol. Med. 75(4):267-82 (1997), the disclosures of which are
incorporated herein by reference in their entireties.

The nucleic acids of the present invention can be
detectably labeled.

Commonly-used labels include radionuclides, such as
³²P, ³³P, ³⁵S, ³H (and for NMR detection, ¹³C and ¹⁵N), haptens
that can be detected by specific antibody or high affinity
binding partner (such as avidin), and fluorophores.

As noted above, detectable labels can be incorporated 
by inclusion of labeled nucleotide analogues in the nucleic 
acid. Such analogues can be incorporated by enzymatic 
polymerization, such as by nick translation, random priming, 
polymerase chain reaction (PCR), terminal transferase tailing,
and end-filling of overhangs, for DNA molecules, and in vitro
transcription driven, e.g., from phage promoters, such as T7, 
T3, and SP6, for RNA molecules. Commercial kits are readily 
available for each such labeling approach. 

Analogues can also be incorporated during automated 
solid phase chemical synthesis. 

As is well known, labels can also be incorporated 
after nucleic acid synthesis, with the 5' phosphate and 3' 
hydroxyl providing convenient sites for post-synthetic covalent
attachment of detectable labels. 

Various other post-synthetic approaches permit
internal labeling of nucleic acids. 

For example, fluorophores can be attached using a
cisplatin reagent that reacts with the N7 of guanine residues
(and, to a lesser extent, adenine bases) in DNA, RNA, and PNA
to provide a stable coordination complex between the nucleic
acid and fluorophore label (Universal Linkage System)
(available from Molecular Probes, Inc., Eugene, OR, USA and
Amersham Pharmacia Biotech, Piscataway, NJ, USA); see Alers et
al., Genes, Chromosomes & Cancer, Vol. 25, pp. 301-305
(1999); Jelsma et al., J. NIH Res. 5:82 (1994); Van Belkum et
al., BioTechniques 16:148-153 (1994), incorporated herein by
reference. As another example, nucleic acids can be labeled
using a disulfide-containing linker (FastTag™ Reagent, Vector
Laboratories, Inc., Burlingame, CA, USA) that is photo- or
thermally coupled to the target nucleic acid using aryl azide
chemistry; after reduction, a free thiol is available for
coupling to a hapten, fluorophore, sugar, affinity ligand, or
other marker.

Multiple independent or interacting labels can be 
incorporated into the nucleic acids of the present invention. 

For example, both a fluorophore and a moiety that in 
proximity thereto acts to quench fluorescence can be included 
to report specific hybridization through release of 
fluorescence quenching, Tyagi et al., Nature Biotechnol.
14:303-308 (1996); Tyagi et al., Nature Biotechnol. 16:49-53
(1998); Sokol et al., Proc. Natl. Acad. Sci. USA
95:11538-11543 (1998); Kostrikis et al., Science 279:1228-1229
(1998); Marras et al., Genet. Anal. 14:151-156 (1999); U.S.
Pat. Nos. 5,846,726 and 5,925,517; or to report
exonucleotidic excision, U.S. Pat. No. 5,538,848; Holland et
al., Proc. Natl. Acad. Sci. USA 88:7276-7280 (1991); Heid et
al., Genome Res. 6(10):986-94 (1996); Kuimelis et al., Nucleic
Acids Symp Ser. (37):255-6 (1997); U.S. Patent No. 5,723,591,
the disclosures of which are incorporated herein by reference 
in their entireties. 

So labeled, the isolated nucleic acids of the present 
invention can be used as probes, as further described below. 

Nucleic acids of the present invention can also 
usefully be bound to a substrate. The substrate can be porous or
solid, planar or non-planar, unitary or distributed; the bond 
can be covalent or noncovalent. Bound to a substrate, nucleic 
acids of the present invention can be used as probes in their 
unlabeled state. 
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For example, the nucleic acids of the present 
invention can usefully be bound to a porous substrate, commonly 
a membrane, typically comprising nitrocellulose, nylon, or 
positively- charged derivatized nylon; so attached, the nucleic 
5 acids of the present invention can be used to detect HTPL 
nucleic acids present within a labeled nucleic acid sample, 
either a sample of genomic nucleic acids or a sample of 
transcript -derived nucleic acids, e.g. by reverse dot blot. 

The nucleic acids of the present invention can also
usefully be bound to a solid substrate, such as glass, although
other solid materials, such as amorphous silicon, crystalline
silicon, or plastics, can also be used. Such plastics include
polymethylacrylic, polyethylene, polypropylene, polyacrylate,
polymethylmethacrylate, polyvinylchloride,
polytetrafluoroethylene, polystyrene, polycarbonate,
polyacetal, polysulfone, celluloseacetate, cellulosenitrate,
nitrocellulose, or mixtures thereof.

Typically, the solid substrate will be rectangular,
although other shapes, particularly disks and even spheres,
present certain advantages. Particularly advantageous
alternatives to glass slides as support substrates for arrays of
nucleic acids are optical discs, as described in Demers,
"Spatially Addressable Combinatorial Chemical Arrays in CD-ROM
Format," international patent publication WO 98/12559,
incorporated herein by reference in its entirety.

The nucleic acids of the present invention can be
attached covalently to a surface of the support substrate or
applied to a derivatized surface in a chaotropic agent that
facilitates denaturation and adherence by presumed noncovalent
interactions, or some combination thereof.

The nucleic acids of the present invention can be
bound to a substrate to which a plurality of other nucleic
acids are concurrently bound, hybridization to each of the
plurality of bound nucleic acids being separately detectable.
At low density, e.g. on a porous membrane, these
substrate-bound collections are typically denominated
macroarrays; at higher density, typically on a solid support,
such as glass, these substrate-bound collections of plural
nucleic acids are colloquially termed microarrays. As used
herein, the term microarray includes arrays of all densities.
It is, therefore, another aspect of the invention to provide
microarrays that include the nucleic acids of the present
invention.

The isolated nucleic acids of the present invention
can be used as hybridization probes to detect, characterize,
and quantify HTPL nucleic acids in, and isolate HTPL nucleic
acids from, both genomic and transcript-derived nucleic acid
samples. When free in solution, such probes are typically, but
not invariably, detectably labeled; bound to a substrate, as in
a microarray, such probes are typically, but not invariably,
unlabeled.

For example, the isolated nucleic acids of the
present invention can be used as probes to detect and
characterize gross alterations in the HTPL genomic locus, such
as deletions, insertions, translocations, and duplications of
the HTPL genomic locus through fluorescence in situ
hybridization (FISH) to chromosome spreads. See, e.g.,
Andreeff et al. (eds.), Introduction to Fluorescence In Situ
Hybridization: Principles and Clinical Applications, John Wiley
& Sons (1999) (ISBN: 0471013455), the disclosure of which is
incorporated herein by reference in its entirety. The isolated
nucleic acids of the present invention can be used as probes to
assess smaller genomic alterations using, e.g., Southern blot
detection of restriction fragment length polymorphisms. The
isolated nucleic acids of the present invention can be used as
probes to isolate genomic clones that include the nucleic acids
of the present invention, which thereafter can be restriction
mapped and sequenced to identify deletions, insertions,
translocations, and substitutions (single nucleotide
polymorphisms, SNPs) at the sequence level.

The isolated nucleic acids of the present invention
can also be used as probes to detect, characterize, and
quantify HTPL nucleic acids in, and isolate HTPL nucleic acids
from, transcript-derived nucleic acid samples.

For example, the isolated nucleic acids of the
present invention can be used as hybridization probes to
detect, characterize by length, and quantify HTPL mRNA by
northern blot of total or poly-A+-selected RNA samples. For
example, the isolated nucleic acids of the present invention
can be used as hybridization probes to detect, characterize by
location, and quantify HTPL message by in situ hybridization to
tissue sections (see, e.g., Schwarzacher et al., In Situ
Hybridization, Springer-Verlag New York (2000) (ISBN:
0387915966), the disclosure of which is incorporated herein by
reference in its entirety). For example, the isolated nucleic
acids of the present invention can be used as hybridization
probes to measure the representation of HTPL clones in a cDNA
library. For example, the isolated nucleic acids of the
present invention can be used as hybridization probes to
isolate HTPL nucleic acids from cDNA libraries, permitting
sequence level characterization of HTPL messages, including
identification of deletions, insertions, and truncations
(including deletions, insertions, and truncations of exons in
alternatively spliced forms) and of single nucleotide
polymorphisms.

All of the aforementioned probe techniques are well
within the skill in the art, and are described at greater
length in standard texts such as Sambrook et al., Molecular
Cloning: A Laboratory Manual (3rd ed.), Cold Spring Harbor
Laboratory Press (2001) (ISBN: 0879695773); Ausubel et al.
(eds.), Short Protocols in Molecular Biology: A Compendium of
Methods from Current Protocols in Molecular Biology (4th ed.),
John Wiley & Sons, 1999 (ISBN: 047132938X); and Walker et al.
(eds.), The Nucleic Acids Protocols Handbook, Humana Press
(2000) (ISBN: 0896034593), the disclosures of which are
incorporated herein by reference in their entireties.

As described in the Examples herein below, the
nucleic acids of the present invention can also be used to
detect and quantify HTPL nucleic acids in transcript-derived
samples (that is, to measure expression of the HTPL gene)
when included in a microarray. Measurement of HTPL expression
has particular utility in diagnosis and treatment of male
infertility and tumors, as further described in the Examples
herein below.

As would be readily apparent to one of skill in the
art, each HTPL nucleic acid probe, whether labeled,
substrate-bound, or both, is thus currently available for use
as a tool for measuring the level of HTPL expression in the
tissues in which expression has already been confirmed, notably
testis, and also adrenal, adult and fetal liver, bone marrow,
brain, kidney, lung, placenta, prostate, skeletal muscle and
colon. The utility is specific to the probe: under high
stringency conditions, the probe reports the level of
expression of message specifically containing that portion of
the HTPL gene included within the probe.

Measuring tools are well known in many arts, not just
in molecular biology, and are known to possess credible,
specific, and substantial utility. For example, U.S. Patent
No. 6,016,191 describes and claims a tool for measuring
characteristics of fluid flow in a hydrocarbon well; U.S.
Patent No. 6,042,549 describes and claims a device for
measuring exercise intensity; U.S. Patent No. 5,889,351
describes and claims a device for measuring viscosity and for
measuring characteristics of a fluid; U.S. Patent No. 5,570,694
describes and claims a device for measuring blood pressure;
U.S. Patent No. 5,930,143 describes and claims a device for
measuring the dimensions of machine tools; U.S. Patent No.
5,279,044 describes and claims a measuring device for
determining an absolute position of a movable element; U.S.
Patent No. 5,186,042 describes and claims a device for
measuring action force of a wheel; and U.S. Patent No.
4,246,774 describes and claims a device for measuring the draft
of smoking articles such as cigarettes.

As for tissues not yet demonstrated to express HTPL,
the HTPL nucleic acid probes of the present invention are
currently available as tools for surveying such tissues to
detect the presence of HTPL nucleic acids.

Survey tools, i.e., tools for determining the
presence and/or location of a desired object by search of an
area, are well known in many arts, not just in molecular
biology, and are known to possess credible, specific, and
substantial utility. For example, U.S. Patent No. 6,046,800
describes and claims a device for surveying an area for objects
that move; U.S. Patent No. 6,025,201 describes and claims an
apparatus for locating and discriminating platelets from
non-platelet particles or cells on a cell-by-cell basis in a
whole blood sample; U.S. Patent No. 5,990,689 describes and
claims a device for detecting and locating anomalies in the
electromagnetic protection of a system; U.S. Patent No.
5,984,175 describes and claims a device for detecting and
identifying wearable user identification units; and U.S. Patent
No. 3,980,986 ("Oil well survey tool") describes and claims a
tool for finding the position of a drill bit working at the
bottom of a borehole.

As noted above, the nucleic acid probes of the
present invention are useful in constructing microarrays; the
microarrays, in turn, are products of manufacture that are
useful for measuring and for surveying gene expression.

When included on a microarray, each HTPL nucleic acid
probe makes the microarray specifically useful for detecting
that portion of the HTPL gene included within the probe, thus
imparting upon the microarray device the ability to detect a
signal where, absent such probe, it would have reported no
signal. This utility makes each individual probe on such
microarray akin to an antenna, circuit, firmware or software
element included in an electronic apparatus, where the antenna,
circuit, firmware or software element imparts upon the
apparatus the ability newly and additionally to detect signal
in a portion of the radio-frequency spectrum where previously
it could not; such devices are known to have specific,
substantial, and credible utility.

Changes in the level of expression need not be
observed for the measurement of expression to have utility.

For example, where gene expression analysis is used
to assess toxicity of chemical agents on cells, the failure of
the agent to change a gene's expression level is evidence that
the agent likely does not affect the pathway of which the gene's
expressed protein is a part. Analogously, where gene
expression analysis is used to assess side effects of
pharmacologic agents (whether in lead compound discovery or in
subsequent screening of lead compound derivatives), the
inability of the agent to alter a gene's expression level is
evidence that the agent does not affect the pathway of which the
gene's expressed protein is a part.

WO 99/58720, incorporated herein by reference in its
entirety, provides methods for quantifying the relatedness of a
first and second gene expression profile and for ordering the
relatedness of a plurality of gene expression profiles, without
regard to the identity or function of the genes whose
expression is used in the calculation.
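
By way of illustration only, relatedness between expression
profiles can be sketched with a simple correlation measure. The
sketch below uses the Pearson correlation coefficient, which is
an assumption made for this example and is not asserted to be the
method of WO 99/58720; the function names and the toy profile
values are likewise hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Illustrative relatedness measure only (an assumption of this sketch).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def rank_by_relatedness(reference, profiles):
    """Order profile names from most to least related to `reference`.

    `profiles` maps a hypothetical sample name to its expression values.
    Gene identity plays no part in the calculation; only the values do.
    """
    return sorted(
        profiles,
        key=lambda name: pearson(reference, profiles[name]),
        reverse=True,
    )
```

The calculation uses only the measured values, so a probe
contributes to the comparison whether or not the function of
the corresponding gene is known.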

Gene expression analysis, including gene expression
analysis by microarray hybridization, is, of course,
principally a laboratory-based art. Devices and apparatus used
principally in laboratories to facilitate laboratory research
are well-established to possess specific, substantial, and
credible utility. For example, U.S. Patent No. 6,001,233
describes and claims a gel electrophoresis apparatus having a
cam-activated clamp; U.S. Patent No. 6,051,831 describes and
claims a high mass detector for use in time-of-flight mass
spectrometers; and U.S. Patent No. 5,824,269 describes and
claims a flow cytometer. As is well known, few gel
electrophoresis apparatuses, TOF-MS devices, or flow cytometers
are sold for consumer use.

Indeed, and in particular, nucleic acid microarrays,
as devices intended for laboratory use in measuring gene
expression, are well-established to have specific, substantial
and credible utility. Thus, the microarrays of the present
invention have at least the specific, substantial and credible
utilities of the microarrays claimed as devices and articles of
manufacture in the following U.S. patents, the disclosures of
which are incorporated herein by reference: U.S. Patent
Nos. 5,445,934 ("Array of oligonucleotides on a solid
substrate"); 5,744,305 ("Arrays of materials attached to a
substrate"); and 6,004,752 ("Solid support with attached
molecules").

Genome-derived single exon probes and genome-derived
single exon probe microarrays have the additional utility,
inter alia, of permitting high-throughput detection of splice
variants of the nucleic acids of the present invention, as
further described in copending and commonly owned U.S. Patent
application no. 09/632,366, filed August 3, 2000, the
disclosure of which is incorporated herein by reference in its
entirety.

The isolated nucleic acids of the present invention
can also be used to prime synthesis of nucleic acid, for
purpose of either analysis or isolation, using mRNA, cDNA, or
genomic DNA as template.

For use as primers, at least 17 contiguous
nucleotides of the isolated nucleic acids of the present
invention will be used. Often, at least 18, 19, or 20
contiguous nucleotides of the nucleic acids of the present
invention will be used, and on occasion at least 20, 22, 24, or
25 contiguous nucleotides of the nucleic acids of the present
invention will be used, and even 30 nucleotides or more of the
nucleic acids of the present invention can be used to prime
specific synthesis.
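
The length criterion above can be sketched in a few lines of
code. The example below is illustrative only: it enumerates
every contiguous window of a chosen length from a sequence and
rejects lengths below the 17-nucleotide minimum stated above.
The function name and the default length of 20 are assumptions
made for this sketch, and no other primer design criteria (such
as melting temperature or self-complementarity) are considered.

```python
MIN_PRIMER_LENGTH = 17  # minimum contiguous length stated in the text


def candidate_primers(sequence, length=20):
    """Return every contiguous window of `length` nt from `sequence`.

    Hypothetical helper: it applies the length criterion only and does
    not evaluate whether a window would prime specific synthesis.
    """
    if length < MIN_PRIMER_LENGTH:
        raise ValueError(
            f"primers use at least {MIN_PRIMER_LENGTH} contiguous nucleotides"
        )
    seq = sequence.upper()
    # A sequence shorter than `length` yields an empty list.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]
```

A sequence of N nucleotides yields N - length + 1 such windows
when N is at least the chosen length.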

The nucleic acid primers of the present invention can
be used, for example, to prime first strand cDNA synthesis on
an mRNA template.

Such primer extension can be done directly to analyze
the message. Alternatively, synthesis on an mRNA template can
be done to produce first strand cDNA. The first strand cDNA
can thereafter be used, inter alia, directly as a
single-stranded probe, as above-described; as a template for
sequencing (permitting identification of alterations,
including deletions, insertions, and substitutions, both normal
allelic variants and mutations associated with abnormal
phenotypes); or as a template, either for second strand cDNA
synthesis (e.g., as an antecedent to insertion into a cloning
or expression vector), or for amplification.

The nucleic acid primers of the present invention can
also be used, for example, to prime single base extension (SBE)
for SNP detection (see, e.g., U.S. Pat. No. 6,004,744, the
disclosure of which is incorporated herein by reference in its
entirety).

As another example, the nucleic acid primers of the
present invention can be used to prime amplification of HTPL
nucleic acids, using transcript-derived or genomic DNA as
template.

Primer-directed amplification methods are now
well-established in the art. Methods for performing the
polymerase chain reaction (PCR) are compiled, inter alia, in
McPherson, PCR (Basics: From Background to Bench), Springer
Verlag (2000) (ISBN: 0387916008); Innis et al. (eds.), PCR
Applications: Protocols for Functional Genomics, Academic Press
(1999) (ISBN: 0123721857); Gelfand et al. (eds.), PCR
Strategies, Academic Press (1998) (ISBN: 0123721822); Newton et
al., PCR, Springer-Verlag New York (1997) (ISBN: 0387915060);
Burke (ed.), PCR: Essential Techniques, John Wiley & Sons Ltd
(1996) (ISBN: 047195697X); White (ed.), PCR Cloning Protocols:
From Molecular Cloning to Genetic Engineering, Vol. 67, Humana
Press (1996) (ISBN: 0896033430); McPherson et al. (eds.), PCR 2:
A Practical Approach, Oxford University Press, Inc. (1995)
(ISBN: 0199634254), the disclosures of which are incorporated
herein by reference in their entireties. Methods for performing
RT-PCR are collected, e.g., in Siebert et al. (eds.), Gene
Cloning and Analysis by RT-PCR, Eaton Publishing
Company/BioTechniques Books Division, 1998 (ISBN: 1881299147);
Siebert (ed.), PCR Technique: RT-PCR, Eaton Publishing
Company/BioTechniques Books (1995) (ISBN: 1881299139), the
disclosures of which are incorporated herein by reference in
their entireties.

Isothermal amplification approaches, such as rolling
circle amplification, are also now well-described. See, e.g.,
Schweitzer et al., Curr. Opin. Biotechnol. 12(1):21-7 (2001);
U.S. Patent Nos. 6,235,502, 6,221,603, 6,210,884, 6,183,960,
5,854,033, 5,714,320, 5,648,245, and international patent
publications WO 97/19193 and WO 00/15779, the disclosures of
which are incorporated herein by reference in their entireties.

Rolling circle amplification can be combined with other
techniques to facilitate SNP detection. See, e.g., Lizardi et
al., Nature Genet. 19(3):225-32 (1998).

As further described below, nucleic acids of the
present invention, inserted into vectors that flank the nucleic
acid insert with a phage promoter, such as T7, T3, or SP6
promoter, can be used to drive in vitro expression of RNA
complementary to either strand of the nucleic acid of the
present invention. The RNA can be used, inter alia, as a
single-stranded probe, in cDNA-mRNA subtraction, or for in
vitro translation.

As will be further discussed herein below, nucleic
acids of the present invention that encode HTPL protein or
portions thereof can be used, inter alia, to express the HTPL
proteins or protein fragments, either alone, or as part of
fusion proteins.

Expression can be from genomic nucleic acids of the
present invention, or from transcript-derived nucleic acids of
the present invention.

Where protein expression is effected from genomic
DNA, expression will typically be effected in eukaryotic,
typically mammalian, cells capable of splicing introns from the
initial RNA transcript. Expression can be driven from episomal
vectors, such as EBV-based vectors, or can be effected from
genomic DNA integrated into a host cell chromosome. As will be
more fully described below, where expression is from
transcript-derived (or otherwise intron-less) nucleic acids of
the present invention, expression can be effected in a wide
variety of prokaryotic or eukaryotic cells.

Expressed in vitro, the protein, protein fragment, or
protein fusion can thereafter be isolated, to be used, inter
alia, as a standard in immunoassays specific for the proteins,
or protein isoforms, of the present invention; to be used as a
therapeutic agent, e.g., to be administered as passive
replacement therapy in individuals deficient in the proteins of
the present invention, or to be administered as a vaccine; to
be used for in vitro production of specific antibody, the
antibody thereafter to be used, e.g., as an analytical reagent
for detection and quantitation of the proteins of the present
invention or to be used as an immunotherapeutic agent.

The isolated nucleic acids of the present invention
can also be used to drive in vivo expression of the proteins of
the present invention. In vivo expression can be driven from a
vector, typically a viral vector, often a vector based upon a
replication incompetent retrovirus, an adenovirus, or an
adeno-associated virus (AAV), for purpose of gene therapy. In
vivo expression can also be driven from signals endogenous to
the nucleic acid or from a vector, often a plasmid vector, such
as pVAX1 (Invitrogen, Carlsbad CA, USA), for purpose of "naked"
nucleic acid vaccination, as further described in U.S. Pat.
Nos. 5,589,466; 5,679,647; 5,804,566; 5,830,877; 5,843,913;
5,880,104; 5,958,891; 5,985,847; 6,017,897; 6,110,898;
6,204,250, the disclosures of which are incorporated herein by
reference in their entireties.

The nucleic acids of the present invention can also
be used for antisense inhibition of transcription or
translation. See Phillips (ed.), Antisense Technology, Part B,
Methods in Enzymology Vol. 314, Academic Press, Inc. (1999)
(ISBN: 012182215X); Phillips (ed.), Antisense Technology, Part
A, Methods in Enzymology Vol. 313, Academic Press, Inc. (1999)
(ISBN: 0121822141); Hartmann et al. (eds.), Manual of Antisense
Methodology (Perspectives in Antisense Science), Kluwer Law
International (1999) (ISBN: 079238539X); Stein et al. (eds.),
Applied Antisense Oligonucleotide Technology, Wiley-Liss
(1998) (ISBN: 0471172790); Agrawal et al. (eds.), Antisense
Research and Application, Springer-Verlag New York, Inc. (1998)
(ISBN: 3540638334); Lichtenstein et al. (eds.), Antisense
Technology: A Practical Approach, Vol. 185, Oxford University
Press, Inc. (1998) (ISBN: 0199635838); Gibson (ed.), Antisense
and Ribozyme Methodology: Laboratory Companion, Chapman & Hall
(1997) (ISBN: 3826100794); Chadwick et al. (eds.),
Oligonucleotides as Therapeutic Agents - Symposium No. 209,
John Wiley & Sons Ltd (1997) (ISBN: 0471972797), the disclosures
of which are incorporated herein by reference in their
entireties.

Nucleic acids of the present invention, particularly
cDNAs of the present invention, that encode full-length human
HTPL protein isoforms, have additional, well-recognized,
immediate, real world utility as commercial products of
manufacture suitable for sale.

For example, Invitrogen Corp. (Carlsbad, CA, USA),
through its Research Genetics subsidiary, sells full length
human cDNAs cloned into one of a selection of expression
vectors as GeneStorm® expression-ready clones; utility is
specific for the gene, since each gene is capable of being
ordered separately and has a distinct catalogue number, and
utility is substantial, each clone selling for $650.00 US.
Similarly, Incyte Genomics (Palo Alto, CA, USA) sells clones
from public and proprietary sources in multi-well plates or
individual tubes.

Nucleic acids of the present invention that include
genomic regions encoding the human HTPL protein, or portions
thereof, have yet further utilities.

For example, genomic nucleic acids of the present
invention can be used as amplification substrates, e.g. for
preparation of genome-derived single exon probes of the present
invention, as described above and in copending and
commonly-owned U.S. patent application nos. 09/864,761, filed
May 23, 2001, 09/774,203, filed January 29, 2001, and
09/632,366, filed August 3, 2000, the disclosures of which are
incorporated herein by reference in their entireties.

As another example, genomic nucleic acids of the
present invention can be integrated nonhomologously into the
genome of somatic cells, e.g. CHO cells, COS cells, or 293
cells, with or without amplification of the insertional locus
in order, e.g., to create stable cell lines capable of
producing the proteins of the present invention.

As another example, more fully described herein
below, genomic nucleic acids of the present invention can be
integrated nonhomologously into embryonic stem (ES) cells to
create transgenic non-human animals capable of producing the
proteins of the present invention.

Genomic nucleic acids of the present invention can
also be used to target homologous recombination to the human
HTPL locus. See, e.g., U.S. Patent Nos. 6,187,305; 6,204,061;
5,631,153; 5,627,059; 5,487,992; 5,464,764; 5,614,396;
5,527,695 and 6,063,630; and Kmiec et al. (eds.), Gene
Targeting Protocols, Vol. 133, Humana Press (2000) (ISBN:
0896033600); Joyner (ed.), Gene Targeting: A Practical
Approach, Oxford University Press, Inc. (2000) (ISBN:
0199637938); Sedivy et al., Gene Targeting, Oxford University
Press (1998) (ISBN: 071677013X); Tymms et al. (eds.), Gene
Knockout Protocols, Humana Press (2000) (ISBN: 0896035727); M.
et al. (eds.), The Gene Knockout FactsBook, Vol. 2, Academic
Press, Inc. (1998) (ISBN: 0124660444); Torres et al.,
Laboratory Protocols for Conditional Gene Targeting, Oxford
University Press (1997) (ISBN: 019963677X); Vega (ed.), Gene
Targeting, CRC Press, LLC (1994) (ISBN: 084938950X), the
disclosures of which are incorporated herein by reference in
their entireties.

Where the genomic region includes transcription
regulatory elements, homologous recombination can be used to
alter the expression of HTPL, both for purpose of in vitro
production of HTPL protein from human cells, and for purpose of
gene therapy. See, e.g., U.S. Pat. Nos. 5,981,214; 6,048,524;
5,272,071.

Fragments of the nucleic acids of the present
invention smaller than those typically used for homologous
recombination can also be used for targeted gene correction or
alteration, possibly by cellular mechanisms different from
those engaged during homologous recombination.

For example, partially duplexed RNA/DNA chimeras have
been shown to have utility in targeted gene correction, U.S.
Pat. Nos. 5,945,339, 5,888,983, 5,871,984, 5,795,972,
5,780,296, 5,760,012, 5,756,325, 5,731,181, the disclosures of
which are incorporated herein by reference in their entireties.

So too have small oligonucleotides fused to triplexing domains
been shown to have utility in targeted gene correction,
Culver et al., "Correction of chromosomal point mutations in
human cells with bifunctional oligonucleotides," Nature
Biotechnol. 17(10):989-93 (1999), as have oligonucleotides
having modified terminal bases or modified terminal
internucleoside bonds, Gamper et al., Nucl. Acids Res.
28(21):4332-9 (2000), the disclosures of which are incorporated
herein by reference.

The isolated nucleic acids of the present invention
can also be used to provide the initial substrate for
recombinant engineering of HTPL protein variants having desired
phenotypic improvements. Such engineering includes, for
example, site-directed mutagenesis, random mutagenesis with
subsequent functional screening, and more elegant schemes for
recombinant evolution of proteins, as are described, inter
alia, in U.S. Pat. Nos. 6,180,406; 6,165,793; 6,117,679; and
6,096,548, the disclosures of which are incorporated herein by
reference in their entireties.

Nucleic acids of the present invention can be 
obtained by using the labeled probes of the present invention 
to probe nucleic acid samples, such as genomic libraries, cDNA 
libraries, and mRNA samples, by standard techniques. Nucleic 
acids of the present invention can also be obtained by 
amplification, using the nucleic acid primers of the present 
invention, as further demonstrated in Example 1, herein below. 

Nucleic acids of the present invention of fewer than about 100 
nt can also be synthesized chemically, typically by solid phase 
synthesis using commercially available automated synthesizers. 

"Full Length" HTPL Nucleic Acids 

In a first series of nucleic acid embodiments, the 
invention provides isolated nucleic acids that encode the 
entirety of the HTPL protein. As discussed above, the "full- 
length" nucleic acids of the present invention can be used, 
inter alia, to express full length HTPL protein. The full- 
length nucleic acids can also be used as nucleic acid probes; 
used as probes, the isolated nucleic acids of these embodiments 
will hybridize to HTPL. 

In a first such embodiment, the invention provides an
isolated nucleic acid comprising (i) the nucleotide sequence of
SEQ ID NO: 1, or (ii) the complement of (i). SEQ ID NO: 1
presents the entire cDNA of HTPL-L, including the 5'
untranslated (UT) region and 3' UT.
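
For illustration, the relationship between a nucleotide
sequence and its complement can be sketched as follows. The
sketch assumes an unambiguous DNA alphabet (A, C, G, T) and
reports the complementary strand in the conventional 5' to 3'
orientation, i.e. as the reverse complement; the function name
is hypothetical and the short input in the usage note is a toy
sequence, not any portion of SEQ ID NO: 1.

```python
# Watson-Crick pairing for unambiguous DNA bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the complementary strand of `sequence`, written 5' to 3'.

    Assumes only A, C, G and T; other characters pass through unchanged.
    """
    return sequence.translate(_COMPLEMENT)[::-1]
```

For instance, `reverse_complement("ATGC")` gives `"GCAT"`, and
applying the function twice returns the original sequence.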

In a second embodiment, the invention provides an
isolated nucleic acid comprising (i) the nucleotide sequence of
SEQ ID NO: 2, (ii) a degenerate variant of the nucleotide
sequence of SEQ ID NO: 2, or (iii) the complement of (i) or
(ii). SEQ ID NO: 2 presents the open reading frame (ORF) from
SEQ ID NO: 1.
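
As an illustrative sketch, a degenerate variant of an open
reading frame can be recognized computationally as a sequence
that differs at the nucleotide level yet encodes the same amino
acid sequence. The example below assumes the standard genetic
code and an in-frame sequence of A, C, G and T only; the
function names are hypothetical, and the short inputs in the
usage note are toy sequences, not portions of SEQ ID NO: 2.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (third base varies fastest); '*' marks a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame DNA sequence codon by codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a, orf_b):
    """True if the two ORFs differ in sequence but encode the same protein."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)
```

For instance, `ATGCTTTAA` and `ATGCTGTGA` both translate to
Met-Leu-stop, so each is a degenerate variant of the other under
this sketch, whereas `ATGCTT` and `ATGCCT` encode different
residues at the second codon.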

In a third embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence that
encodes a polypeptide with the amino acid sequence of SEQ ID
NO: 3 or (ii) the complement of a nucleotide sequence that
encodes a polypeptide with the amino acid sequence of SEQ ID
NO: 3. SEQ ID NO: 3 provides the amino acid sequence of
HTPL-L.

In a fourth embodiment, the invention provides an
isolated nucleic acid having a nucleotide sequence that (i)
encodes a polypeptide having the sequence of SEQ ID NO: 3, (ii)
encodes a polypeptide having the sequence of SEQ ID NO: 3 with
conservative amino acid substitutions, or (iii) is the
complement of (i) or (ii), where SEQ ID NO: 3 provides the
amino acid sequence of HTPL-L.

In another such embodiment, the invention provides an
isolated nucleic acid comprising (i) the nucleotide sequence of
SEQ ID NO: 4, or (ii) the complement of (i). SEQ ID NO: 4
presents the entire cDNA of HTPL-S, including the 5'
untranslated (UT) region and 3' UT.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) the nucleotide sequence of
SEQ ID NO: 5, (ii) a degenerate variant of the nucleotide
sequence of SEQ ID NO: 5, or (iii) the complement of (i) or
(ii). SEQ ID NO: 5 presents the open reading frame (ORF) from
SEQ ID NO: 4.

In yet another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence that
encodes a polypeptide with the amino acid sequence of SEQ ID
NO: 6 or (ii) the complement of a nucleotide sequence that
encodes a polypeptide with the amino acid sequence of SEQ ID
NO: 6. SEQ ID NO: 6 provides the amino acid sequence of
HTPL-S.

In still another embodiment, the invention provides
an isolated nucleic acid having a nucleotide sequence that (i)
encodes a polypeptide having the sequence of SEQ ID NO: 6, (ii)
encodes a polypeptide having the sequence of SEQ ID NO: 6 with
conservative amino acid substitutions, or (iii) is the
complement of (i) or (ii), where SEQ ID NO: 6 provides the
amino acid sequence of HTPL-S.

Selected Partial Nucleic Acids 

In a second series of nucleic acid embodiments, the
invention provides isolated nucleic acids that encode select
portions of HTPL. As will be further discussed herein below,
these "partial" nucleic acids can be used, inter alia, to
express specific portions of HTPL. These "partial" nucleic
acids can also be used, inter alia, as nucleic acid probes.

In a first such embodiment, the invention provides an 
isolated nucleic acid comprising (i) the nucleotide sequence of 
SEQ ID NO: 7, (ii) a degenerate variant of SEQ ID NO: 9, or 
(iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length, typically
no more than about 75 kb in length, more typically no more than
about 50 kb in length. SEQ ID NO: 7 encodes a novel portion of
HTPL-L. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising (i) a nucleotide sequence that 
encodes SEQ ID NO: 10 or (ii) the complement of a nucleotide 
sequence that encodes SEQ ID NO: 10, wherein the isolated 
nucleic acid is no more than about 100 kb in length, typically
no more than about 75 kb in length, frequently no more than 
about 50 kb in length. SEQ ID NO: 10 is the amino acid sequence 
encoded by the portion of HTPL-L not found in any EST 
fragments. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising (i) a nucleotide sequence that 
encodes SEQ ID NO: 10, (ii) a nucleotide sequence that encodes 
SEQ ID NO: 10 with conservative substitutions, or (iii) the 
complement of (i) or (ii), wherein the isolated nucleic acid is
no more than about 100 kb in length, typically no more than 
about 75 kb in length, and often no more than about 50 kb in 
length. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In another such embodiment, the invention provides an 
isolated nucleic acid comprising (i) the nucleotide sequence of 
SEQ ID NO: 11, (ii) a degenerate variant of SEQ ID NO: 12, or 
(iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length, typically
no more than about 75 kb in length, more typically no more than
about 50 kb in length. SEQ ID NO: 11 encodes a novel portion of
HTPL-L. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising (i) a nucleotide sequence that 
encodes SEQ ID NO: 14 or (ii) the complement of a nucleotide 
sequence that encodes SEQ ID NO: 14, wherein the isolated
nucleic acid is no more than about 100 kb in length, typically 
no more than about 75 kb in length, frequently no more than 
about 50 kb in length. SEQ ID NO: 14 is the amino acid sequence 
encoded by the portion of HTPL-L not found in any EST 
fragments. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising (i) a nucleotide sequence that 
encodes SEQ ID NO: 14, (ii) a nucleotide sequence that encodes 
SEQ ID NO: 14 with conservative substitutions, or (iii) the 
complement of (i) or (ii), wherein the isolated nucleic acid is
no more than about 100 kb in length, typically no more than 
about 75 kb in length, and often no more than about 50 kb in 
length. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In another such embodiment, the invention provides an 
isolated nucleic acid comprising (i) the nucleotide sequence of 
SEQ ID NO: 4796, (ii) a degenerate variant of SEQ ID NO: 4798, 
or (iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length, typically
no more than about 75 kb in length, more typically no more than
about 50 kb in length. SEQ ID NO: 4796 encodes a novel portion of
HTPL-S. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising (i) a nucleotide sequence that 
encodes SEQ ID NO: 4799 or (ii) the complement of a nucleotide
sequence that encodes SEQ ID NO: 4799, wherein the isolated 
nucleic acid is no more than about 100 kb in length, typically 
no more than about 75 kb in length, frequently no more than 
about 50 kb in length. SEQ ID NO: 4799 is the amino acid 
sequence encoded by the portion of HTPL-S not found in any EST 
fragments. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising (i) a nucleotide sequence that 
encodes SEQ ID NO: 4799, (ii) a nucleotide sequence that
encodes SEQ ID NO: 4799 with conservative substitutions, or
(iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length, typically 
no more than about 75 kb in length, and often no more than 
about 50 kb in length. Often, the isolated nucleic acids of 
this embodiment are no more than about 25 kb in length, often 
no more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In another such embodiment, the invention provides an 
isolated nucleic acid comprising (i) the nucleotide sequence of 
SEQ ID NO: 4800, or (ii) the complement of (i), wherein the
isolated nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, more typically no
more than about 50 kb in length. SEQ ID NO: 4800 encodes a novel
portion of HTPL-S. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

Cross-Hybridizing Nucleic Acids

In another series of nucleic acid embodiments, the 
invention provides isolated nucleic acids that hybridize to 
various of the HTPL nucleic acids of the present invention.
These cross-hybridizing nucleic acids can be used, inter alia,
as probes for, and to drive expression of, proteins that are
related to HTPL of the present invention as further isoforms,
homologues, paralogues, or orthologues.

In a first embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under high stringency conditions to a probe the nucleotide
sequence of which consists of at least 17 nt, 18, 19, 20, 21,
22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 7 or the
complement of SEQ ID NO: 7, wherein the isolated nucleic acid
is no more than about 100 kb in length, typically no more than
about 75 kb in length, and often no more than about 50 kb in
length. Often, the isolated nucleic acids of this embodiment
are no more than about 25 kb in length, often no more than
about 15 kb in length, and frequently no more than about 10 kb
in length.

In a further embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under moderate stringency conditions to a probe the nucleotide
sequence of which consists of at least 17 nt, 18, 19, 20, 21,
22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 7 or the
complement of SEQ ID NO: 7, wherein the isolated nucleic acid
is no more than about 100 kb in length, typically no more than
about 75 kb in length, and often no more than about 50 kb in
length. Often, the isolated nucleic acids of this embodiment
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 



In a further embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a hybridization probe the 
nucleotide sequence of which (i) encodes a polypeptide having 
the sequence of SEQ ID NO: 10, (ii) encodes a polypeptide 
having the sequence of SEQ ID NO: 10 with conservative amino 
acid substitutions, or (iii) is the complement of (i) or (ii),
wherein the isolated nucleic acid is no more than about 100 kb 
in length, typically no more than about 75 kb in length, and 
often no more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 kb 
in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a probe the nucleotide 
sequence of which consists of at least 17 nt, 18, 19, 20, 21, 
22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 11 or the 
complement of SEQ ID NO: 11, wherein the isolated nucleic acid 
is no more than about 100 kb in length, typically no more than 
about 75 kb in length, and often no more than about 50 kb in 
length. Often, the isolated nucleic acids of this embodiment 
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In a further embodiment, the invention provides an isolated 
nucleic acid comprising a sequence that hybridizes under 
moderate stringency conditions to a probe the nucleotide 
sequence of which consists of at least 17 nt, 18, 19, 20, 21, 
22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 11 or the 
complement of SEQ ID NO: 11, wherein the isolated nucleic acid 
is no more than about 100 kb in length, typically no more than 
about 75 kb in length, and often no more than about 50 kb in 
length. Often, the isolated nucleic acids of this embodiment
are no more than about 25 kb in length, often no more than 
about 15 kb in length, and frequently no more than about 10 kb 
in length. 

In a further embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a hybridization probe the 
nucleotide sequence of which (i) encodes a polypeptide having 
the sequence of SEQ ID NO: 14, (ii) encodes a polypeptide 
having the sequence of SEQ ID NO: 14 with conservative amino 
acid substitutions, or (iii) is the complement of (i) or (ii),
wherein the isolated nucleic acid is no more than about 100 kb 
in length, typically no more than about 75 kb in length, and 
often no more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 kb 
in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a probe the nucleotide 
sequence of which consists of at least 17 nt, 18, 19, 20, 21, 
22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 4796 or the 
complement of SEQ ID NO: 4796, wherein the isolated nucleic 
acid is no more than about 100 kb in length, typically no more 
than about 75 kb in length, and often no more than about 50 kb 
in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In a further embodiment, the invention provides an isolated 
nucleic acid comprising a sequence that hybridizes under 
moderate stringency conditions to a probe the nucleotide 
sequence of which consists of at least 17 nt, 18, 19, 20, 21, 
22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 4796 or the
complement of SEQ ID NO: 4796, wherein the isolated nucleic 
acid is no more than about 100 kb in length, typically no more 
than about 75 kb in length, and often no more than about 50 kb 
in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In a further embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a hybridization probe the 
nucleotide sequence of which (i) encodes a polypeptide having 
the sequence of SEQ ID NO: 4799, (ii) encodes a polypeptide 
having the sequence of SEQ ID NO: 4799 with conservative amino 
acid substitutions, or (iii) is the complement of (i) or (ii),
wherein the isolated nucleic acid is no more than about 100 kb 
in length, typically no more than about 75 kb in length, and 
often no more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 kb 
in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a probe the nucleotide 
sequence of which consists of at least 17 nt, 18, 19, 20, 21, 
22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 4800 or the 
complement of SEQ ID NO: 4800, wherein the isolated nucleic 
acid is no more than about 100 kb in length, typically no more 
than about 75 kb in length, and often no more than about 50 kb 
in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In a further embodiment, the invention provides an isolated
nucleic acid comprising a sequence that hybridizes under 
moderate stringency conditions to a probe the nucleotide 
sequence of which consists of at least 17 nt, 18, 19, 20, 21, 
22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 4800 or the 
complement of SEQ ID NO: 4800, wherein the isolated nucleic 
acid is no more than about 100 kb in length, typically no more 
than about 75 kb in length, and often no more than about 50 kb 
in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

Particularly Useful Nucleic Acids 

Particularly useful among the above-described nucleic
acids are those that are expressed, or the complement of which 
are expressed, in testis, as well as adrenal, adult and fetal 
liver, bone marrow, brain, kidney, lung, placenta, prostate, 
skeletal muscle and colon. 

Also particularly useful among the above-described
nucleic acids are those that encode, or the complement of which 
encode, a membrane protein functioning in the hedgehog 
signaling pathway, mutations of which cause male sterility or 
cancer. 

Other particularly useful embodiments of the nucleic 
acids above-described are those that encode, or the complement 
of which encode, a polypeptide having a Patched motif, or 
having a Sterol-sensing domain, or having at least one
transmembrane domain, especially those having a plurality of 
transmembrane domains, particularly those having twelve (for 
HTPL-L) or seven (for HTPL-S) transmembrane domains in tandem. 

In another series of nucleic acid embodiments, the
invention provides fragments of various of the isolated nucleic 
acids of the present invention which prove useful, inter alia, 
as nucleic acid probes, as amplification primers, and to direct 
expression or synthesis of epitopic or immunogenic protein 
fragments.

In a first embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides
of (i) SEQ ID NO: 7, (ii) a degenerate variant of SEQ ID NO: 9,
or (iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length, typically 
no more than about 75 kb in length, more typically no more than 
about 50 kb in length. Often, the isolated nucleic acids of 
this embodiment are no more than about 25 kb in length, often 
no more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 
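
The fragment criterion above (a minimum run of contiguous nucleotides from a reference sequence or its complement, subject to an upper length bound) can be sketched as a simple check. This is only an illustration under stated assumptions: "complement" is read as the reverse complement strand, "about 100 kb" is treated as a hard cap of 100,000 nt, the reference is a toy sequence rather than any SEQ ID NO, and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def shares_fragment(candidate: str, reference: str, min_len: int = 17) -> bool:
    """True if candidate contains at least min_len contiguous nucleotides
    of reference or of its reverse complement."""
    for target in (reference, reverse_complement(reference)):
        for start in range(len(target) - min_len + 1):
            if target[start:start + min_len] in candidate:
                return True
    return False


def within_length_cap(candidate: str, cap_kb: float = 100.0) -> bool:
    """True if candidate is no longer than cap_kb kilobases (hard cap assumed)."""
    return len(candidate) <= cap_kb * 1000


# Toy reference of 26 nt; not taken from the Sequence Listing.
TOY_REFERENCE = "ATGGCTTTACCGGATCTAGGCATCGA"
toy_candidate = "GG" + TOY_REFERENCE[3:22] + "TT"  # carries 19 contiguous nt
```

A candidate that carries the fragment on the opposite strand, such as `reverse_complement(toy_candidate)`, also satisfies `shares_fragment` in this sketch, while a 16-nt excerpt of the reference does not.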

The invention also provides an isolated nucleic acid 
comprising (i) a nucleotide sequence that encodes a peptide of 
at least 8 contiguous amino acids of SEQ ID NO: 10, (ii) a 
nucleotide sequence that encodes a peptide of at least 15 
contiguous amino acids of SEQ ID NO: 10, or (iii) the 
complement of (i) or (ii), wherein the isolated nucleic acid is
no more than about 100 kb in length, typically no more than 
about 75 kb in length, more typically no more than about 50 kb 
in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic acid 
comprising a nucleotide sequence that encodes (i) a polypeptide 
having the sequence of at least 8 contiguous amino acids of SEQ
ID NO: 10 with conservative amino acid substitutions, (ii) a
polypeptide having the sequence of at least 15 contiguous amino
acids of SEQ ID NO: 10 with conservative amino acid
substitutions, (iii) a polypeptide having the sequence of at
least 8 contiguous amino acids of SEQ ID NO: 10 with moderately
conservative substitutions, (iv) a polypeptide having the
sequence of at least 15 contiguous amino acids of SEQ ID NO: 10
with moderately conservative substitutions, or (v) the
complement of any of (i)-(iv), wherein the isolated nucleic
acid is no more than about 100 kb in length, typically no more
than about 75 kb in length, more typically no more than about
50 kb in length. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

In another embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides
of (i) SEQ ID NO: 11, (ii) a degenerate variant of SEQ ID NO:
12, or (iii) the complement of (i) or (ii), wherein the
isolated nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, more typically no
more than about 50 kb in length. Often, the isolated nucleic
acids of this embodiment are no more than about 25 kb in
length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

The invention also provides an isolated nucleic acid 
comprising (i) a nucleotide sequence that encodes a peptide of
at least 8 contiguous amino acids of SEQ ID NO: 14, (ii) a
nucleotide sequence that encodes a peptide of at least 15
contiguous amino acids of SEQ ID NO: 14, or (iii) the
complement of (i) or (ii), wherein the isolated nucleic acid is
no more than about 100 kb in length, typically no more than
about 75 kb in length, more typically no more than about 50 kb 
in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic acid 
comprising a nucleotide sequence that encodes (i) a polypeptide 
having the sequence of at least 8 contiguous amino acids of SEQ 
ID NO: 14 with conservative amino acid substitutions, (ii) a 
polypeptide having the sequence of at least 15 contiguous amino 
acids of SEQ ID NO: 14 with conservative amino acid 
substitutions, (iii) a polypeptide having the sequence of at 
least 8 contiguous amino acids of SEQ ID NO: 14 with moderately 
conservative substitutions, (iv) a polypeptide having the 
sequence of at least 15 contiguous amino acids of SEQ ID NO: 14 
with moderately conservative substitutions, or (v) the 
complement of any of (i)-(iv), wherein the isolated nucleic
acid is no more than about 100 kb in length, typically no more 
than about 75 kb in length, more typically no more than about 
50 kb in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides 
of (i) SEQ ID NO: 4796, (ii) a degenerate variant of SEQ ID NO: 
4798, or (iii) the complement of (i) or (ii), wherein the
isolated nucleic acid is no more than about 100 kb in length, 
typically no more than about 75 kb in length, more typically no 
more than about 50 kb in length. Often, the isolated nucleic 
acids of this embodiment are no more than about 25 kb in
length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length. 

The invention also provides an isolated nucleic acid 
comprising (i) a nucleotide sequence that encodes a peptide of 
at least 8 contiguous amino acids of SEQ ID NO: 4799, (ii) a 
nucleotide sequence that encodes a peptide of at least 15 
contiguous amino acids of SEQ ID NO: 4799, or (iii) the
complement of (i) or (ii), wherein the isolated nucleic acid is
no more than about 100 kb in length, typically no more than 
about 75 kb in length, more typically no more than about 50 kb 
in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic acid 
comprising a nucleotide sequence that encodes (i) a polypeptide 
having the sequence of at least 8 contiguous amino acids of SEQ 
ID NO: 4799 with conservative amino acid substitutions, (ii) a 
polypeptide having the sequence of at least 15 contiguous amino 
acids of SEQ ID NO: 4799 with conservative amino acid 
substitutions, (iii) a polypeptide having the sequence of at 
least 8 contiguous amino acids of SEQ ID NO: 4799 with 
moderately conservative substitutions, (iv) a polypeptide 
having the sequence of at least 15 contiguous amino acids of 
SEQ ID NO: 4799 with moderately conservative substitutions, or 
(v) the complement of any of (i)-(iv), wherein the isolated
nucleic acid is no more than about 100 kb in length, typically 
no more than about 75 kb in length, more typically no more than 
about 50 kb in length. Often, the isolated nucleic acids of 
this embodiment are no more than about 25 kb in length, often
no more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In another embodiment, the invention provides an
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 nucleotides 
of (i) SEQ ID NO: 4800, or (ii) the complement of (i), wherein
the isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length, more
typically no more than about 50 kb in length. Often, the
isolated nucleic acids of this embodiment are no more than
about 25 kb in length, often no more than about 15 kb in
length, and frequently no more than about 10 kb in length.

Single Exon Probes 

The invention further provides genome-derived single
exon probes having portions of no more than one exon of the
HTPL gene. As further described in commonly owned and
copending U.S. patent application serial no. 09/632,366, filed
August 3, 2000 ("Methods and Apparatus for High Throughput
Detection and Characterization of Alternatively Spliced
Genes"), the disclosure of which is incorporated herein by
reference in its entirety, such single exon probes have
particular utility in identifying and characterizing splice
variants. In particular, such single exon probes are useful
for identifying and discriminating the expression of distinct
isoforms of HTPL.

In a first embodiment, the invention provides an 
isolated nucleic acid comprising a nucleotide sequence of no 
more than one portion of SEQ ID NOs: 15-18 or the complement
of SEQ ID NOs: 15-18, wherein the portion comprises at least
17 contiguous nucleotides, 18 contiguous nucleotides, 20
contiguous nucleotides, 24 contiguous nucleotides, 25
contiguous nucleotides, or 50 contiguous nucleotides of any one
of SEQ ID NOs: 15-18, or their complement. In a further
embodiment, the exonic portion comprises the entirety of the
referenced SEQ ID NO: or its complement.

In other embodiments, the invention provides isolated 
single exon probes having the nucleotide sequence of any one of 
SEQ ID NOs: 19-22.

Transcription Control Nucleic Acids 

In another aspect, the present invention provides
genome-derived isolated nucleic acids that include nucleic acid
sequence elements that control transcription of the HTPL gene.
These nucleic acids can be used, inter alia, to drive
expression of heterologous coding regions in recombinant
constructs, thus conferring upon such heterologous coding
regions the expression pattern of the native HTPL gene. These
nucleic acids can also be used, conversely, to target
heterologous transcription control elements to the HTPL genomic
locus, altering the expression pattern of the HTPL gene itself.

In a first such embodiment, the invention provides an
isolated nucleic acid comprising the nucleotide sequence of SEQ
ID NO: 23 or its complement, wherein the isolated nucleic acid
is no more than about 100 kb in length, typically no more than
about 75 kb in length, more typically no more than about 50 kb
in length. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising at least 17, 18, 20, 24, or 25
nucleotides of the sequence of SEQ ID NO: 23 or its complement,
wherein the isolated nucleic acid is no more than about 100 kb
in length, typically no more than about 75 kb in length, more
typically no more than about 50 kb in length. Often, the
isolated nucleic acids of this embodiment are no more than
about 25 kb in length, often no more than about 15 kb in
length, and frequently no more than about 10 kb in length.



VECTORS AND HOST CELLS

In another aspect, the present invention provides 
vectors that comprise one or more of the isolated nucleic acids 
of the present invention, and host cells in which such vectors 
have been introduced.

The vectors can be used, inter alia, for propagating
the nucleic acids of the present invention in host cells
(cloning vectors), for shuttling the nucleic acids of the
present invention between host cells derived from disparate
organisms (shuttle vectors), for inserting the nucleic acids of
the present invention into host cell chromosomes (insertion
vectors), for expressing sense or antisense RNA transcripts of
the nucleic acids of the present invention in vitro or within a
host cell, and for expressing polypeptides encoded by the
nucleic acids of the present invention, alone or as fusions to
heterologous polypeptides. Vectors of the present invention 
will often be suitable for several such uses. 

Vectors are by now well-known in the art, and are 
described, inter alia, in Jones et al. (eds.), Vectors: Cloning
Applications: Essential Techniques (Essential Techniques
Series), John Wiley & Sons Ltd, 1998 (ISBN: 047196266X); Jones et
al. (eds.), Vectors: Expression Systems: Essential Techniques
(Essential Techniques Series), John Wiley & Sons Ltd, 1998
(ISBN: 0471962678); Gacesa et al., Vectors: Essential Data, John
Wiley & Sons, 1995 (ISBN: 0471948411); Cid-Arregui (eds.),
Viral Vectors: Basic Science and Gene Therapy, Eaton Publishing
Co., 2000 (ISBN: 188129935X); Sambrook et al., Molecular
Cloning: A Laboratory Manual (3rd ed.), Cold Spring Harbor
Laboratory Press, 2001 (ISBN: 0879695773); Ausubel et al.
(eds.), Short Protocols in Molecular Biology: A Compendium of
Methods from Current Protocols in Molecular Biology (4th ed.),
John Wiley & Sons, 1999 (ISBN: 047132938X), the disclosures of
which are incorporated herein by reference in their entireties. 

Furthermore, an enormous variety of vectors are available 
commercially. Use of existing vectors and modifications 
thereof being well within the skill in the art, only basic 
features need be described here. 

Typically, vectors are derived from virus, plasmid, 
prokaryotic or eukaryotic chromosomal elements, or some 
combination thereof, and include at least one origin of 
replication, at least one site for insertion of heterologous 
nucleic acid, typically in the form of a polylinker with 
multiple, tightly clustered, single cutting restriction sites, 
and at least one selectable marker, although some integrative 
vectors will lack an origin that is functional in the host to 
be chromosomally modified, and some vectors will lack 
selectable markers. Vectors of the present invention will 
further include at least one nucleic acid of the present 
invention inserted into the vector in at least one location. 

Where present, the origin of replication and 
selectable markers are chosen based upon the desired host cell 
or host cells; the host cells, in turn, are selected based upon 
the desired application. 

For example, prokaryotic cells, typically E. coli, 
are typically chosen for cloning. In such case, vector 
replication is predicated on the replication strategies of 
coliform-infecting phage (such as phage lambda, M13, T7, T3
and P1) or on the replication origin of autonomously
replicating episomes, notably the ColE1 plasmid and later
derivatives, including pBR322 and the pUC series plasmids.
Where E. coli is used as host, selectable markers are,
analogously, chosen for selectivity in gram negative bacteria:
e.g., typical markers confer resistance to antibiotics, such as
ampicillin, tetracycline, chloramphenicol, kanamycin, 
streptomycin, zeocin; auxotrophic markers can also be used. 

As another example, yeast cells, typically S. 
cerevisiae, are chosen, inter alia, for eukaryotic genetic 
studies, due to the ease of targeting genetic changes by 
homologous recombination and to the ready ability to complement 
genetic defects using recombinantly expressed proteins, for
identification of interacting protein components, e.g. through 
use of a two-hybrid system, and for protein expression. 
Vectors of the present invention for use in yeast will 
typically, but not invariably, contain an origin of replication 
suitable for use in yeast and a selectable marker that is 
functional in yeast. 

Integrative YIp vectors do not replicate
autonomously, but integrate, typically in single copy, into the 
yeast genome at low frequencies and thus replicate as part of 
the host cell chromosome; these vectors lack an origin of 
replication that is functional in yeast, although they 
typically have at least one origin of replication suitable for 
propagation of the vector in bacterial cells. YEp vectors, in 
contrast, replicate episomally and autonomously due to presence 
of the yeast 2 micron plasmid origin (2 μm ori). The YCp yeast
centromere plasmid vectors are autonomously replicating vectors 
containing centromere sequences, CEN, and autonomously 
replicating sequences, ARS; the ARS sequences are believed to 
correspond to the natural replication origins of yeast 
chromosomes. YACs are based on yeast linear plasmids, denoted 
YLp, containing homologous or heterologous DNA sequences that 
function as telomeres (TEL) in vivo, as well as containing 
yeast ARS (origins of replication) and CEN (centromeres) 
segments.

Selectable markers in yeast vectors include a variety
of auxotrophic markers, the most common of which are (in
Saccharomyces cerevisiae) URA3, HIS3, LEU2, TRP1 and LYS2,
which complement specific auxotrophic mutations, such as
ura3-52, his3-Δ1, leu2-Δ1, trp1-Δ1 and lys2-201. The URA3 and
LYS2 yeast genes further permit negative selection based on
specific inhibitors, 5-fluoro-orotic acid (FOA) and
α-aminoadipic acid (αAA), respectively, that prevent growth of
the prototrophic strains but allow growth of the ura3 and lys2
mutants, respectively. Other selectable markers confer
resistance to, e.g., zeocin.
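
The positive and negative selection described above for URA3 can be sketched as a small truth table. This is an illustrative simplification rather than a growth model: the function name and its three boolean parameters are hypothetical, and the sketch assumes that medium containing FOA is also supplemented with uracil.

```python
def grows(has_functional_ura3: bool, uracil_in_medium: bool, foa_in_medium: bool) -> bool:
    """Predict growth of a yeast strain under URA3-based selection (simplified)."""
    # Negative selection: FOA prevents growth of prototrophic (URA3) cells.
    if foa_in_medium and has_functional_ura3:
        return False
    # Positive selection: ura3 mutants cannot grow without added uracil.
    if not has_functional_ura3 and not uracil_in_medium:
        return False
    return True


# A ura3 host carrying a URA3 vector on medium lacking uracil.
transformant_on_dropout = grows(True, uracil_in_medium=False, foa_in_medium=False)
# A cell that has lost the URA3 marker, plated on FOA plus uracil.
cured_on_foa = grows(False, uracil_in_medium=True, foa_in_medium=True)
```

The same pattern would apply to LYS2 and αAA, with lysine in place of uracil.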

As yet another example, insect cells are often chosen
for high efficiency protein expression. Where the host cells
are from Spodoptera frugiperda - e.g., Sf9 and Sf21 cell lines,
and expresSF™ cells (Protein Sciences Corp., Meriden, CT,
USA) - the vector replicative strategy is typically based upon
the baculovirus life cycle. Typically, baculovirus transfer
vectors are used to replace the wild-type AcMNPV polyhedrin
gene with a heterologous gene of interest. Sequences that
flank the polyhedrin gene in the wild-type genome are
positioned 5' and 3' of the expression cassette on the transfer
vectors. Following cotransfection with AcMNPV DNA, a
homologous recombination event occurs between these sequences,
resulting in a recombinant virus carrying the gene of interest
and the polyhedrin or p10 promoter. Selection can be based
upon visual screening for lacZ fusion activity.

As yet another example, mammalian cells are often 
chosen for expression of proteins intended as pharmaceutical 
agents, and are also chosen as host cells for screening of 
potential agonists and antagonists of a protein or a
physiological pathway. 

Where mammalian cells are chosen as host cells,
vectors intended for autonomous extrachromosomal replication
will typically include a viral origin, such as the SV40 origin
(for replication in cell lines expressing the large T-antigen,
such as COS1 and COS7 cells), the papillomavirus origin, or the
EBV origin for long term episomal replication (for use, e.g.,
in 293-EBNA cells, which constitutively express the EBV EBNA-1
gene product and adenovirus E1A). Vectors intended for
integration, and thus replication as part of the mammalian
chromosome, can, but need not, include an origin of replication
functional in mammalian cells, such as the SV40 origin.
Vectors based upon viruses, such as adenovirus,
adeno-associated virus, vaccinia virus, and various mammalian
retroviruses, will typically replicate according to the viral
replicative strategy.

Selectable markers for use in mammalian cells include 
resistance to neomycin (G418), blasticidin, hygromycin and
zeocin, and selection based upon the purine salvage pathway
using HAT medium. 

Plant cells can also be used for expression, with 
the vector replicon typically derived from a plant virus (e.g., 
cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) and 
selectable markers chosen for suitability in plants. 

For propagation of nucleic acids of the present 
invention that are larger than can readily be accommodated in
vectors derived from plasmids or virus, the invention further 
provides artificial chromosomes — BACs, YACs, PACs, and HACs — 
that comprise HTPL nucleic acids, often genomic nucleic acids. 

The BAC system is based on the well-characterized E.
coli F-factor, a low copy plasmid that exists in a supercoiled
circular form in host cells. The structural features of the
F-factor allow stable maintenance of individual human DNA
clones as well as easy manipulation of the cloned DNA. See
Shizuya et al., Keio J. Med. 50(1):26-30 (2001); Shizuya et
al., Proc. Natl. Acad. Sci. USA 89(18):8794-7 (1992).

YACs are based on yeast linear plasmids, denoted YLp,
containing homologous or heterologous DNA sequences that
function as telomeres (TEL) in vivo, as well as containing
yeast ARS (origins of replication) and CEN (centromeres)
segments.

HACs are human artificial chromosomes. Kuroiwa et
al., Nature Biotechnol. 18(10):1086-90 (2000); Henning et al.,
Proc. Natl. Acad. Sci. USA 96(2):592-7 (1999); Harrington et
al., Nature Genet. 15(4):345-55 (1997). In one version, long
synthetic arrays of alpha satellite DNA are combined with
telomeric DNA and genomic DNA to generate linear
microchromosomes that are mitotically and cytogenetically
stable in the absence of selection.

PACs are P1-derived artificial chromosomes.
Sternberg, Proc. Natl. Acad. Sci. USA 87(1):103-7 (1990);
Sternberg et al., New Biol. 2(2):151-62 (1990); Pierce et al.,
Proc. Natl. Acad. Sci. USA 89(6):2056-60 (1992).

Vectors of the present invention will also often
include elements that permit in vitro transcription of RNA from
the inserted heterologous nucleic acid. Such vectors
typically include a phage promoter, such as that from T7, T3,
or SP6, flanking the nucleic acid insert. Often two different
such promoters flank the inserted nucleic acid, permitting
separate in vitro production of both sense and antisense
strands.
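
The relationship between the two run-off transcripts obtainable from an insert flanked by two opposed phage promoters can be sketched as follows. This is a minimal illustration only: the helper names and the six-base insert are hypothetical, and promoter and vector sequences are ignored.

```python
# Maps each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def as_rna(dna: str) -> str:
    """Write a DNA strand as RNA of the same sequence (T replaced by U)."""
    return dna.upper().replace("T", "U")


def run_off_transcripts(insert_coding_strand: str) -> dict:
    """Sense and antisense RNAs for an insert given as its coding strand, 5' to 3'."""
    return {
        "sense": as_rna(insert_coding_strand),
        "antisense": as_rna(reverse_complement(insert_coding_strand)),
    }


transcripts = run_off_transcripts("ATGGCC")  # made-up insert
```

Transcription from one promoter yields the sense strand; transcription from the promoter on the opposite side yields its reverse complement.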

Expression vectors of the present invention - that
is, those vectors that will drive expression of polypeptides
from the inserted heterologous nucleic acid - will often
include a variety of other genetic elements operatively linked
to the protein-encoding heterologous nucleic acid insert,
typically genetic elements that drive transcription, such as
promoters and enhancer elements, those that facilitate RNA
processing, such as transcription termination and/or
polyadenylation signals, and those that facilitate translation,
such as ribosomal consensus sequences.

For example, vectors for expressing proteins of the 
present invention in prokaryotic cells, typically E. coli, will 
include a promoter, often a phage promoter, such as phage 
lambda pL promoter, the trc promoter, a hybrid derived from the 
trp and lac promoters, the bacteriophage T7 promoter (in E. 
coli cells engineered to express the T7 polymerase), or the
araBAD operon. Often, such prokaryotic expression vectors will 
further include transcription terminators, such as the aspA 
terminator, and elements that facilitate translation, such as a 
consensus ribosome binding site and translation termination 
codon, Schomer et al., Proc. Natl. Acad. Sci. USA 83:8506-8510
(1986).

As another example, vectors for expressing proteins 
of the present invention in yeast cells, typically S. 
cerevisiae, will include a yeast promoter, such as the CYC1 
promoter, the GAL1 promoter, ADH1 promoter, or the GPD 
promoter, and will typically have elements that facilitate 
transcription termination, such as the transcription 
termination signals from the CYC1 or ADH1 gene. 

As another example, vectors for expressing proteins 
of the present invention in mammalian cells will include a 
promoter active in mammalian cells. Such promoters are often 
drawn from mammalian viruses, such as the enhancer-promoter
sequences from the immediate early gene of the human
cytomegalovirus (CMV), the enhancer-promoter sequences from the
Rous sarcoma virus long terminal repeat (RSV LTR), and the
enhancer-promoter from SV40. Often, expression is enhanced by
incorporation of polyadenylation sites, such as the late SV40
polyadenylation site and the polyadenylation signal and
transcription termination sequences from the bovine growth
hormone (BGH) gene, and ribosome binding sites. Furthermore,
vectors can include introns, such as intron II of the rabbit
β-globin gene and the SV40 splice elements.

Vector-driven protein expression can be constitutive
or inducible. 

Inducible vectors can include naturally inducible
promoters, such as the trc promoter, which is regulated by the
lac operon, the pL promoter, which is regulated by
tryptophan, and the MMTV-LTR promoter, which is inducible by
dexamethasone, or can contain synthetic promoters and/or
additional elements that confer inducible control on adjacent
promoters. Examples of inducible synthetic promoters are the
hybrid Plac/ara-1 promoter and the PLtetO-1 promoter. The
PLtetO-1 promoter takes advantage of the high expression levels
from the PL promoter of phage lambda, but replaces the lambda
repressor sites with two copies of operator 2 of the Tn10
tetracycline resistance operon, causing this promoter to be
tightly repressed by the Tet repressor protein and induced in
response to tetracycline (Tc) and Tc derivatives such as
anhydrotetracycline.

As another example of inducible elements, hormone
response elements, such as the glucocorticoid response element
(GRE) and the estrogen response element (ERE), can confer
hormone inducibility where vectors are used for expression in
cells having the respective hormone receptors. To reduce
background levels of expression, elements responsive to
ecdysone, an insect hormone, can be used instead, with
coexpression of the ecdysone receptor.

Expression vectors can be designed to fuse the
expressed polypeptide to small protein tags that facilitate
purification and/or visualization.

For example, proteins of the present invention can be
expressed with a polyhistidine tag that facilitates
purification of the fusion protein by immobilized metal
affinity chromatography, for example using NiNTA resin (Qiagen
Inc., Valencia, CA, USA) or TALON™ resin (cobalt immobilized
affinity chromatography medium, Clontech Labs, Palo Alto, CA,
USA). As another example, the fusion protein can include a
chitin-binding tag and self-excising intein, permitting
chitin-based purification with self-removal of the fused tag
(IMPACT™ system, New England Biolabs, Inc., Beverly, MA, USA).
Alternatively, the fusion protein can include a
calmodulin-binding peptide tag, permitting purification by
calmodulin affinity resin (Stratagene, La Jolla, CA, USA), or a
specifically excisable fragment of the biotin carboxylase
carrier protein, permitting purification of in vivo
biotinylated protein using an avidin resin and subsequent tag
removal (Promega, Madison, WI, USA). As another useful
alternative, the proteins of the present invention can be
expressed as a fusion to glutathione-S-transferase, the
affinity and specificity of binding to glutathione permitting
purification using glutathione affinity resins, such as
Glutathione-Superflow Resin (Clontech Laboratories, Palo Alto,
CA, USA), with subsequent elution with free glutathione.
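
As a rough illustration of what an in-frame tag fusion means at the sequence level, the sketch below appends a C-terminal polyhistidine tag to a coding sequence. Everything here is an assumption made for illustration: the function name, the choice of six CAC codons, the TAA stop codon, and the made-up input sequence are not taken from any particular vector.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_his_tag(cds: str, n_his: int = 6, stop: str = "TAA") -> str:
    """Append n_his histidine codons in frame, then a stop codon (illustrative only)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Drop the native stop codon so translation continues into the tag.
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    return cds + "CAC" * n_his + stop


tagged = add_c_terminal_his_tag("ATGGCCTGA")  # made-up three-codon ORF
```

In practice the tag is usually supplied by the vector, and the insert is cloned so that its reading frame matches that of the tag.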

Other tags include, for example, the Xpress epitope, 
detectable by anti-Xpress antibody (Invitrogen, Carlsbad, CA, 
USA), a myc tag, detectable by anti-myc tag antibody, the V5
epitope, detectable by anti-V5 antibody (Invitrogen, Carlsbad,
CA, USA), the FLAG® epitope, detectable by anti-FLAG antibody
(Stratagene, La Jolla, CA, USA), and the HA epitope. 

For secretion of expressed proteins, vectors can 
include appropriate sequences that encode secretion signals, 
such as leader peptides. For example, the pSecTag2 vectors 
(Invitrogen, Carlsbad, CA, USA) are 5.2 kb mammalian expression
vectors that carry the secretion signal from the V-J2-C region
of the mouse Ig kappa-chain for efficient secretion of
recombinant proteins from a variety of mammalian cell lines.

Expression vectors can also be designed to fuse 
proteins encoded by the heterologous nucleic acid insert to 
polypeptides larger than purification and/or identification 
tags. Useful protein fusions include those that permit display 
of the encoded protein on the surface of a phage or cell, 
fusions to intrinsically fluorescent proteins, such as those 
that have a green fluorescent protein (GFP)-like chromophore,
fusions to the IgG Fc region, and fusions for use in two hybrid
systems.

Vectors for phage display fuse the encoded
polypeptide to, e.g., the gene III protein (pIII) or gene VIII
protein (pVIII) for display on the surface of filamentous
phage, such as M13. See Barbas et al., Phage Display: A
Laboratory Manual, Cold Spring Harbor Laboratory Press (2001)
(ISBN 0-87969-546-3); Kay et al. (eds.), Phage Display of
Peptides and Proteins: A Laboratory Manual, San Diego: Academic
Press, Inc., 1996; Abelson et al. (eds.), Combinatorial
Chemistry, Methods in Enzymology vol. 267, Academic Press (May
1996).

Vectors for yeast display, e.g., the pYD1 yeast
display vector (Invitrogen, Carlsbad, CA, USA), use the
a-agglutinin yeast adhesion receptor to display recombinant
protein on the surface of S. cerevisiae. Vectors for mammalian
display, e.g., the pDisplay™ vector (Invitrogen, Carlsbad, CA,
USA), target recombinant proteins using an N-terminal cell
surface targeting signal and a C-terminal transmembrane
anchoring domain of platelet derived growth factor receptor.

A wide variety of vectors now exist that fuse 
proteins encoded by heterologous nucleic acids to the 
chromophore of the substrate-independent, intrinsically
fluorescent green fluorescent protein from Aequorea victoria
("GFP") and its variants. These proteins are intrinsically
fluorescent: the GFP-like chromophore is entirely encoded by
its amino acid sequence and can fluoresce without requirement
for cofactor or substrate.

Structurally, the GFP-like chromophore comprises an
11-stranded β-barrel (β-can) with a central α-helix, the
central α-helix having a conjugated π-resonance system that
includes two aromatic ring systems and the bridge between them.
The π-resonance system is created by autocatalytic cyclization
among amino acids; cyclization proceeds through an
imidazolinone intermediate, with subsequent dehydrogenation by
molecular oxygen at the Cα-Cβ bond of a participating tyrosine.

The GFP-like chromophore can be selected from GFP-like
chromophores found in naturally occurring proteins, such
as A. victoria GFP (GenBank accession number AAA27721), Renilla
reniformis GFP, FP583 (GenBank accession no. AF168419) (DsRed),
FP593 (AF272711), FP483 (AF168420), FP484 (AF168424), FP595
(AF246709), FP486 (AF168421), FP538 (AF168423), and FP506
(AF168422), and need include only so much of the native protein
as is needed to retain the chromophore's intrinsic
fluorescence. Methods for determining the minimal domain
required for fluorescence are known in the art. Li et
al., "Deletions of the Aequorea victoria Green Fluorescent
Protein Define the Minimal Domain Required for Fluorescence,"
J. Biol. Chem. 272:28545-28549 (1997).

Alternatively, the GFP-like chromophore can be
selected from GFP-like chromophores modified from those found
in nature. Typically, such modifications are made to improve
recombinant production in heterologous expression systems (with
or without change in protein sequence), to alter the excitation
and/or emission spectra of the native protein, to facilitate
purification, to facilitate or as a consequence of cloning, or
are a fortuitous consequence of research investigation.

The methods for engineering such modified GFP-like 
chromophores and testing them for fluorescence activity, both 
alone and as part of protein fusions, are well-known in the
art. Early results of these efforts are reviewed in Heim et
al., Curr. Biol. 6:178-182 (1996), incorporated herein by
reference in its entirety; a more recent review, with
tabulation of useful mutations, is found in Palm et al.,
"Spectral Variants of Green Fluorescent Protein," in Green
Fluorescent Proteins, Conn (ed.), Methods Enzymol. vol. 302,
pp. 378-394 (1999), incorporated herein by reference in its
entirety. A variety of such modified chromophores are now
commercially available and can readily be used in the fusion
proteins of the present invention.

For example, EGFP ("enhanced GFP"), Cormack et al.,
Gene 173:33-38 (1996); U.S. Pat. Nos. 6,090,919 and 5,804,387,
is a red-shifted, human codon-optimized variant of GFP that has
been engineered for brighter fluorescence, higher expression in
mammalian cells, and for an excitation spectrum optimized for
use in flow cytometers. EGFP can usefully contribute a GFP-like
chromophore to the fusion proteins of the present
invention. A variety of EGFP vectors, both plasmid and viral,
are available commercially (Clontech Labs, Palo Alto, CA, USA),
including vectors for bacterial expression, vectors for
N-terminal protein fusion expression, vectors for expression of
C-terminal protein fusions, and for bicistronic expression.

Toward the other end of the emission spectrum, EBFP
("enhanced blue fluorescent protein") and BFP2 contain four
amino acid substitutions that shift the emission from green to
blue, enhance the brightness of fluorescence and improve
solubility of the protein, Heim et al., Curr. Biol. 6:178-182
(1996); Cormack et al., Gene 173:33-38 (1996). EBFP is
optimized for expression in mammalian cells whereas BFP2, which
retains the original jellyfish codons, can be expressed in
bacteria; as is further discussed below, the host cell of
production does not affect the utility of the resulting fusion
protein. The GFP-like chromophores from EBFP and BFP2 can
usefully be included in the fusion proteins of the present
invention, and vectors containing these blue-shifted variants
are available from Clontech Labs (Palo Alto, CA, USA).

Analogously, EYFP ("enhanced yellow fluorescent
protein"), also available from Clontech Labs, contains four
amino acid substitutions, different from EBFP, Ormo et al.,
Science 273:1392-1395 (1996), that shift the emission from
green to yellowish-green. Citrine, an improved yellow
fluorescent protein mutant, is described in Heikal et al.,
Proc. Natl. Acad. Sci. USA 97:11996-12001 (2000). ECFP
("enhanced cyan fluorescent protein") (Clontech Labs, Palo
Alto, CA, USA) contains six amino acid substitutions, one of
which shifts the emission spectrum from green to cyan. Heim et
al., Curr. Biol. 6:178-182 (1996); Miyawaki et al., Nature
388:882-887 (1997). The GFP-like chromophore of each of these
GFP variants can usefully be included in the fusion proteins of
the present invention.

The GFP-like chromophore can also be drawn from other
modified GFPs, including those described in U.S. Pat. Nos.
6,124,128; 6,096,865; 6,090,919; 6,066,476; 6,054,321;
6,027,881; 5,968,750; 5,874,304; 5,804,387; 5,777,079;
5,741,668; and 5,625,048, the disclosures of which are
incorporated herein by reference in their entireties. See also
Conn (ed.), Green Fluorescent Protein, Methods in Enzymol. Vol.
302, pp 378-394 (1999), incorporated herein by reference in its
entirety. A variety of such modified chromophores are now
commercially available and can readily be used in the fusion
proteins of the present invention.

Fusions to the IgG Fc region increase serum half-life
of protein pharmaceutical products through interaction with the
FcRn receptor (also denominated the FcRp receptor and the
Brambell receptor, FcRb), further described in international
patent application nos. WO 97/43316, WO 97/34631, WO 96/32478,
WO 96/18412.

For long-term, high-yield recombinant production of 
the proteins, protein fusions, and protein fragments of the 
present invention, stable expression is particularly useful. 

Stable expression is readily achieved by integration 
into the host cell genome of vectors having selectable markers, 
followed by selection for integrants. 

For example, the pUB6/V5-His A, B, and C vectors 
(Invitrogen, Carlsbad, CA, USA) are designed for high-level 
stable expression of heterologous proteins in a wide range of 
mammalian tissue types and cell lines. pUB6/V5-His uses the 
promoter/enhancer sequence from the human ubiquitin C gene to 
drive expression of recombinant proteins: expression levels in 
293, CHO, and NIH3T3 cells are comparable to levels from the 
CMV and human EF-1α promoters. The bsd gene permits rapid
selection of stably transfected mammalian cells with the potent 
antibiotic blasticidin. 

Replication incompetent retroviral vectors, typically 
derived from Moloney murine leukemia virus, prove particularly 
useful for creating stable transfectants having integrated
provirus. The highly efficient transduction machinery of
retroviruses, coupled with the availability of a variety of
packaging cell lines - such as RetroPack PT67, EcoPack2-293,
AmphoPack-293, and GP2-293 cell lines (all available from
Clontech Laboratories, Palo Alto, CA, USA) - allows a wide host
range to be infected with high efficiency; varying the
multiplicity of infection readily adjusts the copy number of
the integrated provirus. Retroviral vectors are available with
a variety of selectable markers, such as resistance to
neomycin, hygromycin, and puromycin, permitting ready selection
of stable integrants.
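
The dependence of provirus copy number on multiplicity of infection (MOI) can be illustrated with a simple Poisson model. The Poisson assumption is a common simplification introduced here for illustration, not something stated in the text above; it treats every infecting particle as integrating independently, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k proviral copies at a given MOI."""
    return math.exp(-moi) * moi**k / math.factorial(k)


def fraction_transduced(moi: float) -> float:
    """Fraction of cells receiving at least one copy."""
    return 1.0 - math.exp(-moi)


def mean_copies_in_transduced(moi: float) -> float:
    """Average copy number among cells that received at least one copy."""
    if moi <= 0:
        raise ValueError("MOI must be positive")
    return moi / fraction_transduced(moi)


low = mean_copies_in_transduced(0.1)   # mostly single-copy integrants
high = mean_copies_in_transduced(5.0)  # mostly multi-copy integrants
```

Under this model a low MOI gives few transduced cells, nearly all carrying one copy, whereas a high MOI transduces nearly every cell with several copies each.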

The present invention further includes host cells 
comprising the vectors of the present invention, either present 
episomally within the cell or integrated, in whole or in part, 
into the host cell chromosome. 

Among other considerations, some of which are 
described above, a host cell strain may be chosen for its 
ability to process the expressed protein in the desired 
fashion. Such post-translational modifications of the
polypeptide include, but are not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation, and
acylation, and it is an aspect of the present invention to
provide HTPL proteins with such post-translational
modifications.

As noted earlier, host cells can be prokaryotic or 
eukaryotic. Representative examples of appropriate host cells 
include, but are not limited to, bacterial cells, such as E. 
coli, Caulobacter crescentus, Streptomyces species, and 
Salmonella typhimurium; yeast cells, such as Saccharomyces 
cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, Pichia
methanolica; insect cell lines, such as those from Spodoptera
frugiperda - e.g., Sf9 and Sf21 cell lines, and expresSF™
cells (Protein Sciences Corp., Meriden, CT, USA) - Drosophila
S2 cells, and Trichoplusia ni High Five® Cells (Invitrogen,
Carlsbad, CA, USA); and mammalian cells. Typical mammalian
cells include COS1 and COS7 cells, Chinese hamster ovary (CHO)
cells, NIH 3T3 cells, 293 cells, HEPG2 cells, HeLa cells,
L cells, murine ES cell lines (e.g., from strains 129/SV,
C57/BL6, DBA-1, 129/SVJ), K562, Jurkat cells, and BW5147.
Other mammalian cell lines are well known and readily available
from the American Type Culture Collection (ATCC) (Manassas, VA,
USA) and the National Institute of General Medical Sciences
(NIGMS) Human Genetic Cell Repository at the Coriell Cell
Repositories (Camden, NJ, USA).

Methods for introducing the vectors and nucleic acids 
of the present invention into the host cells are well known in 
the art; the choice of technique will depend primarily upon the 
specific vector to be introduced and the host cell chosen. 

For example, phage lambda vectors will typically be 
packaged using a packaging extract (e.g., Gigapack® packaging
extract, Stratagene, La Jolla, CA, USA), and the packaged virus
used to infect E. coli. Plasmid vectors will typically be
introduced into chemically competent or electrocompetent
bacterial cells. 

E. coli cells can be rendered chemically competent by 
treatment, e.g., with CaCl₂, or a solution of Mg²⁺, Mn²⁺, Ca²⁺,
Rb⁺ or K⁺, dimethyl sulfoxide, dithiothreitol, and hexammine
cobalt (III), Hanahan, J. Mol. Biol. 166(4):557-80 (1983), and
vectors introduced by heat shock. A wide variety of chemically
competent strains are also available commercially (e.g.,
Epicurian Coli® XL10-Gold® Ultracompetent Cells (Stratagene, La
Jolla, CA, USA); DH5α competent cells (Clontech Laboratories,
Palo Alto, CA, USA); TOP10 Chemically Competent E. coli Kit
(Invitrogen, Carlsbad, CA, USA)).

Bacterial cells can be rendered electrocompetent -
that is, competent to take up exogenous DNA by
electroporation - by various pre-pulse treatments; vectors are
introduced by electroporation followed by subsequent outgrowth
in selected media. An extensive series of protocols is
provided online in Electroprotocols (Bio-Rad, Richmond, CA, USA)
(http://www.bio-rad.com/LifeScience/pdf/New_Gene_Pulser.pdf).

Vectors can be introduced into yeast cells by 
spheroplasting, treatment with lithium salts, electroporation, 
or protoplast fusion.

Spheroplasts are prepared by the action of hydrolytic
enzymes - a snail-gut extract, usually denoted Glusulase, or
Zymolyase, an enzyme from Arthrobacter luteus - to remove
portions of the cell wall in the presence of osmotic
stabilizers, typically 1 M sorbitol. DNA is added to the
spheroplasts, and the mixture is co-precipitated with a
solution of polyethylene glycol (PEG) and Ca²⁺. Subsequently,
the cells are resuspended in a solution of sorbitol, mixed with
molten agar and then layered on the surface of a selective
plate containing sorbitol. For lithium-mediated
transformation, yeast cells are treated with lithium acetate,
which apparently permeabilizes the cell wall, DNA is added and
the cells are co-precipitated with PEG. The cells are exposed
to a brief heat shock, washed free of PEG and lithium acetate,
and subsequently spread on plates containing ordinary selective
medium. Increased frequencies of transformation are obtained
by using specially-prepared single-stranded carrier DNA and
certain organic solvents. Schiestl et al., Curr. Genet.
16(5-6):339-46 (1989). For electroporation, freshly-grown
yeast cultures are typically washed, suspended in an osmotic
protectant, such as sorbitol, mixed with DNA, and the cell
suspension pulsed in an electroporation device. Subsequently,
the cells are spread on the surface of plates containing
selective media. Becker et al., Methods Enzymol. 194:182-7
(1991). The efficiency of transformation by electroporation
can be increased over 100-fold by using PEG, single-stranded
carrier DNA and cells that are in late log-phase of growth.
Larger constructs, such as YACs, can be introduced by
protoplast fusion.

Mammalian and insect cells can be directly infected 
by packaged viral vectors, or transfected by chemical or 
electrical means.

For chemical transfection, DNA can be coprecipitated
with CaPO₄ or introduced using liposomal and nonliposomal
lipid-based agents. Commercial kits are available for CaPO₄
transfection (CalPhos™ Mammalian Transfection Kit, Clontech
Laboratories, Palo Alto, CA, USA), and lipid-mediated
transfection can be practiced using commercial reagents, such
as LIPOFECTAMINE™ 2000, LIPOFECTAMINE™ Reagent, CELLFECTIN®
Reagent, and LIPOFECTIN® Reagent (Invitrogen, Carlsbad, CA,
USA), DOTAP Liposomal Transfection Reagent, FuGENE 6,
X-tremeGENE Q2, DOSPER (Roche Molecular Biochemicals,
Indianapolis, IN, USA), Effectene™, PolyFect®, Superfect®
(Qiagen, Inc., Valencia, CA, USA). Protocols for
electroporating mammalian cells can be found online in
Electroprotocols (Bio-Rad, Richmond, CA, USA)
(http://www.bio-rad.com/LifeScience/pdf/New_Gene_Pulser.pdf).
See also, Norton et al. (eds.), Gene Transfer Methods:
Introducing DNA into Living Cells and Organisms, BioTechniques
Books, Eaton Publishing Co. (2000) (ISBN 1-881299-34-1),
incorporated herein by reference in its entirety.

Other transfection techniques include transfection by 
particle bombardment. See, e.g., Cheng et al., Proc. Natl.
Acad. Sci. USA 90(10):4455-9 (1993); Yang et al., Proc. Natl.
Acad. Sci. USA 87(24):9568-72 (1990).

PROTEINS 

In another aspect, the present invention provides 
HTPL proteins, various fragments thereof suitable for use as 
antigens (e.g., for epitope mapping) and for use as immunogens 
(e.g., for raising antibodies or as vaccines), fusions of HTPL 
polypeptides and fragments to heterologous polypeptides, and 
conjugates of the proteins, fragments, and fusions of the
present invention to other moieties (e.g., to carrier proteins,
to fluorophores).

FIG. 3 and FIG. 4 present the predicted amino acid
sequences encoded by the HTPL cDNA clones. The amino acid
sequences are further presented, respectively, in SEQ ID NO: 3
(full length amino acid coding sequence of human HTPL-L) and 6
(full length amino acid coding sequence of human HTPL-S).

Unless otherwise indicated, amino acid sequences of 
the proteins of the present invention were determined as a 
predicted translation from a nucleic acid sequence. 
Accordingly, any amino acid sequence presented herein may 
contain errors due to errors in the nucleic acid sequence, as 
described in detail above. Furthermore, single nucleotide 
polymorphisms (SNPs) occur frequently in eukaryotic genomes - 
more than 1.4 million SNPs have already been identified in the
human genome, International Human Genome Sequencing Consortium,
Nature 409:860-921 (2001) - and the sequence determined from
one individual of a species may differ from other allelic forms
present within the population. Small deletions and insertions
can often be found that do not alter the function of the
protein.
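
How a single nucleotide difference, whether a sequencing error or a SNP, may or may not propagate into a predicted translation can be sketched as follows. The sketch uses the standard genetic code; the function names and the three short sequences are made up for illustration and are not HTPL sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate complete codons of a DNA coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def differences(protein_a: str, protein_b: str) -> list:
    """List (position, residue_a, residue_b) at 1-based positions that differ."""
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(protein_a, protein_b))
        if a != b
    ]


reference = "ATGGAAGTTTGA"  # made-up ORF
silent = "ATGGAGGTTTGA"     # GAA to GAG, both encode Glu
missense = "ATGGAAGCTTGA"   # GTT to GCT, Val becomes Ala
```

A change in a third codon position often leaves the predicted protein unchanged, whereas the second-position change above alters one residue.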

Accordingly, it is an aspect of the present invention 
to provide proteins not only identical in sequence to those 
described with particularity herein, but also to provide 
isolated proteins at least about 65% identical in sequence to 
those described with particularity herein, typically at least 
about 70%, 75%, 80%, 85%, or 90% identical in sequence to those 
described with particularity herein, usefully at least about 
91%, 92%, 93%, 94%, or 95% identical in sequence to those 
described with particularity herein, usefully at least about 
96%, 97%, 98%, or 99% identical in sequence to those described 
with particularity herein, and, most conservatively, at least 
about 99.5%, 99.6%, 99.7%, 99.8% and 99.9% identical in
sequence to those described with particularity herein. These
sequence variants can be naturally occurring or can result from 
human intervention by way of random or directed mutagenesis. 

For purposes herein, percent identity of two amino
acid sequences is determined using the procedure of Tatusova et
al., "Blast 2 sequences - a new tool for comparing protein and
nucleotide sequences", FEMS Microbiol Lett. 174:247-250 (1999),
which procedure is effectuated by the computer program BLAST 2
SEQUENCES, available online at

http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html.

To assess percent identity of amino acid sequences, the BLASTP
module of BLAST 2 SEQUENCES is used with default values of (i)
BLOSUM62 matrix, Henikoff et al., Proc. Natl. Acad. Sci. USA
89(22):10915-9 (1992); (ii) open gap 11 and extension gap 1
penalties; and (iii) gap x_dropoff 50, expect 10, word size 3,
filter, and both sequences are entered in their entireties.
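
By way of illustration only, the arithmetic behind a percent identity figure can be sketched as follows. The sketch assumes that an alignment of the two sequences has already been produced (for example by BLASTP) and is supplied as two gapped strings of equal length; the function name percent_identity and the use of "-" as the gap symbol are assumptions made for the example, and the code does not reproduce the BLAST 2 SEQUENCES alignment procedure itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the aligned columns of two gapped sequences.

    Hypothetical helper: it only counts identical columns in an alignment
    that was computed elsewhere; it does not perform the alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column is an identity only if both residues are equal and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)
```

In this sketch the denominator is the number of aligned columns, gaps included, which is one common convention; other conventions (for example, dividing by the length of the shorter sequence) give different values for the same alignment.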

As is well known, amino acid substitutions occur 
frequently among natural allelic variants, with conservative 
substitutions often occasioning only de minimis change in 
protein function. 

Accordingly, it is an aspect of the present invention 
to provide proteins not only identical in sequence to those 
described with particularity herein, but also to provide 
isolated proteins having the sequence of HTPL proteins, or 
portions thereof, with conservative amino acid substitutions. 
It is a further aspect to provide isolated proteins having the 
sequence of HTPL proteins, and portions thereof, with 
moderately conservative amino acid substitutions. These 
conservatively-substituted and moderately conservatively-substituted
variants can be naturally occurring or can result
from human intervention. 

Although there are a variety of metrics for calling 
conservative amino acid substitutions, based primarily on
either observed changes among evolutionarily related proteins
or on predicted chemical similarity, for purposes herein a
conservative replacement is any change having a positive value
in the PAM250 log-likelihood matrix reproduced herein below
(see Gonnet et al., Science 256(5062):1443-5 (1992)):

For purposes herein, a "moderately conservative" replacement is
any change having a nonnegative value in the PAM250
log-likelihood matrix reproduced herein above.
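
The two definitions above amount to a test on the sign of a matrix entry, which can be sketched as follows. The function name classify_substitution and the representation of the matrix as a mapping from residue pairs to scores are assumptions made for the example; the caller is expected to supply the actual matrix values, none of which are reproduced here. Because every conservative replacement (positive score) is also moderately conservative (nonnegative score), the sketch reports the strictest category that applies.

```python
from typing import Mapping, Tuple


def classify_substitution(matrix: Mapping[Tuple[str, str], float],
                          original: str, replacement: str) -> str:
    """Classify a replacement by the sign of its substitution-matrix score.

    Hypothetical helper: `matrix` must be supplied by the caller and maps
    residue pairs to log-likelihood scores.
    """
    if original == replacement:
        return "identical"
    key = (original, replacement)
    # Substitution matrices are symmetric, so accept either ordering.
    score = matrix[key] if key in matrix else matrix[(replacement, original)]
    if score > 0:
        return "conservative"
    if score == 0:
        return "moderately conservative"
    return "non-conservative"
```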

As is also well known in the art, relatedness of 
proteins can also be characterized using a functional test, the 
ability of the encoding nucleic acids to base-pair to one 
another at defined hybridization stringencies.

It is, therefore, another aspect of the invention to
provide isolated proteins not only identical in sequence to 
those described with particularity herein, but also to provide 
isolated proteins ("hybridization related proteins") that are 
encoded by nucleic acids that hybridize under high stringency 
conditions (as defined herein above) to all or to a portion of 
various of the isolated nucleic acids of the present invention
("reference nucleic acids"). It is a further aspect of the
invention to provide isolated proteins ("hybridization related 
proteins") that are encoded by nucleic acids that hybridize 
under moderate stringency conditions (as defined herein above) 
to all or to a portion of various of the isolated nucleic acids
of the present invention ("reference nucleic acids").

The hybridization related proteins can be alternative 
isoforms, homologues, paralogues, and orthologues of the HTPL 
protein of the present invention. Particularly useful 
orthologues are those from other primate species, such as 
chimpanzee, rhesus macaque monkey, baboon, orangutan, and
gorilla; from rodents, such as rats, mice, and guinea pigs; from
lagomorphs, such as rabbits; and from domestic livestock, such
as cow, pig, sheep, horse, and goat.

Relatedness of proteins can also be characterized 
using a second functional test, the ability of a first protein 
competitively to inhibit the binding of a second protein to an 
antibody. 

It is, therefore, another aspect of the present 
invention to provide isolated proteins not only identical in 
sequence to those described with particularity herein, but also 
to provide isolated proteins ("cross-reactive proteins") that 
competitively inhibit the binding of antibodies to all or to a 
portion of various of the isolated HTPL proteins of the present 
invention ("reference proteins"). Such competitive inhibition
can readily be determined using immunoassays well known in the
art.

Among the proteins of the present invention that 
differ in amino acid sequence from those described with 
particularity herein — including those that have deletions and 
insertions causing up to 10% non-identity, those having
conservative or moderately conservative substitutions, 
hybridization related proteins, and cross-reactive proteins — 
those that substantially retain one or more HTPL activities are 
particularly useful. As described above, those activities 
include regulating male germ cell development and tumor 
suppression. 

Residues that are tolerant of change while retaining 
function can be identified by altering the protein at known 
residues using methods known in the art, such as alanine 
scanning mutagenesis, Cunningham et al., Science
244(4908):1081-5 (1989); transposon linker scanning
mutagenesis, Chen et al., Gene 263(1-2):39-48 (2001);
combinations of homolog- and alanine-scanning mutagenesis, Jin
et al., J. Mol. Biol. 226(3):851-65 (1992); and combinatorial
alanine scanning, Weiss et al., Proc. Natl. Acad. Sci. USA
97(16):8950-4 (2000), followed by functional assay. Transposon
linker scanning kits are available commercially (New England
Biolabs, Beverly, MA, USA, catalog no. E7-102S; EZ::TN™
In-Frame Linker Insertion Kit, catalogue no. EZI04KN, Epicentre
Technologies Corporation, Madison, WI, USA).

As further described below, the isolated proteins of 
the present invention can readily be used as specific 
immunogens to raise antibodies that specifically recognize HTPL 
proteins, their isoforms, homologues, paralogues, and/or 
orthologues. The antibodies, in turn, can be used, inter alia, 
specifically to assay for the HTPL proteins of the present 
invention - e.g., by ELISA, for detection of protein in fluid
samples, such as serum, by immunohistochemistry or laser
scanning cytometry, for detection of protein in tissue samples, 
or by flow cytometry, for detection of intracellular protein in 
cell suspensions - for specific antibody-mediated isolation 
and/or purification of HTPL proteins, as for example by 
immunoprecipitation, and for use as specific agonists or 
antagonists of HTPL action. 

The isolated proteins of the present invention are 
also immediately available for use as specific standards in 
assays used to determine the concentration and/or amount 
specifically of the HTPL proteins of the present invention. As 
is well known, ELISA kits for detection and quantitation of 
protein analytes typically include isolated and purified 
protein of known concentration for use as a measurement 
standard (e.g., the human interferon-γ OptEIA kit, catalog no.
555142, Pharmingen, San Diego, CA, USA, includes human
recombinant gamma interferon, baculovirus produced).

The isolated proteins of the present invention are 
also immediately available for use as specific biomolecule 
capture probes for surface-enhanced laser desorption ionization
(SELDI) detection of protein-protein interactions, WO 98/59362;
WO 98/59360; WO 98/59361; and Merchant et al., Electrophoresis
21(6):1164-77 (2000), the disclosures of which are incorporated
herein by reference in their entireties. Analogously, the
isolated proteins of the present invention are also immediately
available for use as specific biomolecule capture probes on
BIACORE surface plasmon resonance probes. See Weinberger et
al., Pharmacogenomics 1(4):395-416 (2000); Malmqvist, Biochem.
Soc. Trans. 27(2):335-40 (1999).

The isolated proteins of the present invention are 
also useful as a therapeutic supplement in patients having a 
specific deficiency in HTPL production.

In another aspect, the invention also provides
fragments of various of the proteins of the present invention.
The protein fragments are useful, inter alia, as antigenic and
immunogenic fragments of HTPL.

By "fragments" of a protein is here intended isolated 
proteins (equally, polypeptides, peptides, oligopeptides),
however obtained, that have an amino acid sequence identical to
a portion of the reference amino acid sequence, which portion
is at least 6 amino acids and less than the entirety of the
reference protein. As so defined, "fragments" need not be
obtained by physical fragmentation of the reference protein, 
although such provenance is not thereby precluded. 

Fragments of at least 6 contiguous amino acids are 
useful in mapping B cell and T cell epitopes of the reference 
protein. See, e.g., Geysen et al., "Use of peptide synthesis
to probe viral antigens for epitopes to a resolution of a
single amino acid," Proc. Natl. Acad. Sci. USA 81:3998-4002
(1984) and U.S. Pat. Nos. 4,708,871 and 5,595,915, the
disclosures of which are incorporated herein by reference in 
their entireties. Because the fragment need not itself be 
immunogenic, part of an immunodominant epitope, nor even 
recognized by native antibody, to be useful in such epitope 
mapping, all fragments of at least 6 amino acids of the 
proteins of the present invention have utility in such a study. 
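
As an illustrative sketch only, the set of contiguous fragments of a given length that might be synthesized for such an epitope-mapping study can be enumerated with a sliding window. The function name contiguous_fragments, the default window of 6 residues, and the step of one residue are assumptions made for the example and are not drawn from the cited references; the sketch also does not enforce the requirement that a fragment be shorter than the entire reference protein.

```python
from typing import Iterator, Tuple


def contiguous_fragments(sequence: str, length: int = 6,
                         step: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, fragment) for each window of `length`.

    Hypothetical helper for listing overlapping peptides of a reference
    amino acid sequence given in one-letter code.
    """
    if length < 1 or step < 1:
        raise ValueError("length and step must be positive")
    # The last window starts at len(sequence) - length (0-based).
    for start in range(0, len(sequence) - length + 1, step):
        yield start + 1, sequence[start:start + length]
```

A reference sequence of n residues yields n - 5 overlapping 6-mers when the step is one residue, and no fragments at all when the sequence is shorter than the window.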

Fragments of at least 8 contiguous amino acids, often 
at least 15 contiguous amino acids, have utility as immunogens 
for raising antibodies that recognize the proteins of the 
present invention. See, e.g., Lerner, "Tapping the 
immunological repertoire to produce antibodies of predetermined 
specificity," Nature 299:592-596 (1982); Shinnick et al.,
"Synthetic peptide immunogens as vaccines," Annu. Rev.
Microbiol. 37:425-46 (1983); Sutcliffe et al., "Antibodies that
react with predetermined sites on proteins," Science 219:660-6
(1983), the disclosures of which are incorporated herein by
reference in their entireties. As further described in the
above-cited references, virtually all 8-mers, conjugated to a 
carrier, such as a protein, prove immunogenic — that is, prove 
capable of eliciting antibody for the conjugated peptide; 
accordingly, all fragments of at least 8 amino acids of the 
proteins of the present invention have utility as immunogens.

Fragments of at least 8, 9, 10 or 12 contiguous amino 
acids are also useful as competitive inhibitors of binding of 
the entire protein, or a portion thereof, to antibodies (as in 
epitope mapping), and to natural binding partners, such as
subunits in a multimeric complex or to receptors or ligands of 
the subject protein; this competitive inhibition permits 
identification and separation of molecules that bind 
specifically to the protein of interest, U.S. Pat. Nos.
5,539,084 and 5,783,674, incorporated herein by reference in
their entireties. 

The protein, or protein fragment, of the present 
invention is thus at least 6 amino acids in length, typically 
at least 8, 9, 10 or 12 amino acids in length, and often at 
least 15 amino acids in length. Often, the protein of the
present invention, or fragment thereof, is at least 20 amino
acids in length, even 25 amino acids, 30 amino acids, 35 amino
acids, or 50 amino acids or more in length. Of course, larger 
fragments having at least 75 amino acids, 100 amino acids, or 
even 150 amino acids are also useful, and at times preferred. 

The present invention further provides fusions of 
each of the proteins and protein fragments of the present 
invention to heterologous polypeptides. 

By fusion is here intended that the protein or 
protein fragment of the present invention is linearly 
contiguous to the heterologous polypeptide in a peptide-bonded
polymer of amino acids or amino acid analogues; by
"heterologous polypeptide" is here intended a polypeptide that
does not naturally occur in contiguity with the protein or
protein fragment of the present invention. As so defined, the
fusion can consist entirely of a plurality of fragments of the
HTPL protein in altered arrangement; in such case, any of the
HTPL fragments can be considered heterologous to the other HTPL 
fragments in the fusion protein. More typically, however, the 
heterologous polypeptide is not drawn from the HTPL protein 
itself. 

The fusion proteins of the present invention will
include at least one fragment of the protein of the present
invention, which fragment is at least 6, typically at least 8,
often at least 15, and usefully at least 16, 17, 18, 19, or 20
amino acids long. The fragment of the protein of the present
invention to be included in the fusion can usefully be at least
25 amino acids long, at least 50 amino acids long, and can be at
least 75, 100, or even 150 amino acids long. Fusions that include
the entirety of the proteins of the present invention have
particular utility.

The heterologous polypeptide included within the
fusion protein of the present invention is at least 6 amino
acids in length, often at least 8 amino acids in length, and
usefully at least 15, 20, and 25 amino acids in length.
Fusions that include larger polypeptides, such as the IgG Fc
region, and even entire proteins (such as GFP chromophore-
containing proteins), have particular utility.

As described above in the description of vectors and
expression vectors of the present invention, which discussion
is incorporated herein by reference in its entirety,
heterologous polypeptides to be included in the fusion proteins
of the present invention can usefully include those designed to
facilitate purification and/or visualization of
recombinantly-expressed proteins. Although purification tags can
also be incorporated into fusions that are chemically synthesized,
chemical synthesis typically provides sufficient purity that
further purification by HPLC suffices; however, visualization
tags as above described retain their utility even when the
protein is produced by chemical synthesis, and when so included
render the fusion proteins of the present invention useful as
directly detectable markers of HTPL presence.

As also discussed above, heterologous polypeptides to
be included in the fusion proteins of the present invention can
usefully include those that facilitate secretion of
recombinantly expressed proteins - into the periplasmic space
or extracellular milieu for prokaryotic hosts, into the culture
medium for eukaryotic cells - through incorporation of
secretion signals and/or leader sequences.

Other useful protein fusions of the present invention
include those that permit use of the protein of the present
invention as bait in a yeast two-hybrid system. See Bartel et
al. (eds.), The Yeast Two-Hybrid System, Oxford University
Press (1997) (ISBN: 0195109384); Zhu et al., Yeast Hybrid
Technologies, Eaton Publishing, (2000) (ISBN 1-881299-15-5);
Fields et al., Trends Genet. 10(8):286-92 (1994); Mendelsohn et
al., Curr. Opin. Biotechnol. 5(5):482-6 (1994); Luban et al.,
Curr. Opin. Biotechnol. 6(1):59-64 (1995); Allen et al., Trends
Biochem. Sci. 20(12):511-6 (1995); Drees, Curr. Opin. Chem.
Biol. 3(1):64-70 (1999); Topcu et al., Pharm. Res.
17(9):1049-55 (2000); Fashena et al., Gene 250(1-2):1-14
(2000), the disclosures of which are incorporated herein by
reference in their entireties. Typically, such fusion is to
either E. coli LexA or yeast GAL4 DNA binding domains. Related
bait plasmids are available that express the bait fused to a
nuclear localization signal.

Other useful protein fusions include those that
permit display of the encoded protein on the surface of a phage
or cell, fusions to intrinsically fluorescent proteins, such as
green fluorescent protein (GFP), and fusions to the IgG Fc
region, as described above, which discussion is incorporated
here by reference in its entirety.

The proteins and protein fragments of the present
invention can also usefully be fused to protein toxins, such as
Pseudomonas exotoxin A, diphtheria toxin, shiga toxin A,
anthrax toxin lethal factor, and ricin, in order to effect ablation
of cells that bind or take up the proteins of the present
invention.

The isolated proteins, protein fragments, and protein 
fusions of the present invention can be composed of natural 
amino acids linked by native peptide bonds, or can contain any 
or all of nonnatural amino acid analogues, nonnative bonds, and
post-synthetic (post-translational) modifications, either
throughout the length of the protein or localized to one or 
more portions thereof. 

As is well known in the art, when the isolated
protein is used, e.g., for epitope mapping, the range of such
nonnatural analogues, nonnative inter-residue bonds, or
post-synthesis modifications will be limited to those that permit
binding of the peptide to antibodies. When used as an
immunogen for the preparation of antibodies in a non-human
host, such as a mouse, the range of such nonnatural analogues,
nonnative inter-residue bonds, or post-synthesis modifications
will be limited to those that do not interfere with the
immunogenicity of the protein. When the isolated protein is
used as a therapeutic agent, such as a vaccine or for
replacement therapy, the range of such changes will be limited
to those that do not confer toxicity upon the isolated protein.

Non-natural amino acids can be incorporated during 
solid phase chemical synthesis or by recombinant techniques, 
although the former is typically more common.

Solid phase chemical synthesis of peptides is well
established in the art. Procedures are described, inter alia,
in Chan et al. (eds.), Fmoc Solid Phase Peptide Synthesis: A
Practical Approach (Practical Approach Series), Oxford Univ.
Press (March 2000) (ISBN: 0199637245); Jones, Amino Acid and
Peptide Synthesis (Oxford Chemistry Primers, No 7), Oxford
Univ. Press (August 1992) (ISBN: 0198556683); and Bodanszky,
Principles of Peptide Synthesis (Springer Laboratory), Springer
Verlag (December 1993) (ISBN: 0387564314), the disclosures of
which are incorporated herein by reference in their entireties.

For example, D-enantiomers of natural amino acids can
readily be incorporated during chemical peptide synthesis: 
peptides assembled from D-amino acids are more resistant to 
proteolytic attack; incorporation of D-enantiomers can also be 
used to confer specific three dimensional conformations on the 
peptide. Other amino acid analogues commonly added during 
chemical synthesis include ornithine, norleucine, 
phosphorylated amino acids (typically phosphoserine,
phosphothreonine, phosphotyrosine), L-malonyltyrosine, a
non-hydrolyzable analog of phosphotyrosine (Kole et al., Biochem.
Biophys. Res. Com. 209:817-821 (1995)), and various halogenated
phenylalanine derivatives. 

Amino acid analogues having detectable labels are 
also usefully incorporated during synthesis to provide a 
labeled polypeptide. 

Biotin, for example (indirectly detectable through 
interaction with avidin, streptavidin, neutravidin, captavidin, 
or anti-biotin antibody), can be added using
biotinoyl-(9-fluorenylmethoxycarbonyl)-L-lysine (FMOC biocytin)
(Molecular Probes, Eugene, OR, USA). (Biotin can also be added
enzymatically by incorporation into a fusion protein of an E.
coli BirA substrate peptide.)

The FMOC and tBOC derivatives of dabcyl-L-lysine
(Molecular Probes, Inc., Eugene, OR, USA) can be used to
incorporate the dabcyl chromophore at selected sites in the
peptide sequence during synthesis. The aminonaphthalene
derivative EDANS, the most common fluorophore for pairing with
the dabcyl quencher in fluorescence resonance energy transfer
(FRET) systems, can be introduced during automated synthesis of
peptides by using EDANS-FMOC-L-glutamic acid or the
corresponding tBOC derivative (both from Molecular Probes,
Inc., Eugene, OR, USA). Tetramethylrhodamine fluorophores can
be incorporated during automated FMOC synthesis of peptides
using (FMOC)-TMR-L-lysine (Molecular Probes, Inc., Eugene, OR,
USA).

Other useful amino acid analogues that can be 
incorporated during chemical synthesis include aspartic acid, 
glutamic acid, lysine, and tyrosine analogues having allyl 
side-chain protection (Applied Biosystems, Inc., Foster City, 
CA, USA); the allyl side chain permits synthesis of cyclic,
branched-chain, sulfonated, glycosylated, and phosphorylated
peptides.

A large number of other FMOC-protected non-natural 
amino acid analogues capable of incorporation during chemical 
synthesis are available commercially, including, e.g.,
Fmoc-2-aminobicyclo[2.2.1]heptane-2-carboxylic acid,
Fmoc-3-endo-aminobicyclo[2.2.1]heptane-2-endo-carboxylic acid,
Fmoc-3-exo-aminobicyclo[2.2.1]heptane-2-exo-carboxylic acid,
Fmoc-3-endo-amino-bicyclo[2.2.1]hept-5-ene-2-endo-carboxylic acid,
Fmoc-3-exo-amino-bicyclo[2.2.1]hept-5-ene-2-exo-carboxylic acid,
Fmoc-cis-2-amino-1-cyclohexanecarboxylic acid,
Fmoc-trans-2-amino-1-cyclohexanecarboxylic acid,
Fmoc-1-amino-1-cyclopentanecarboxylic acid,
Fmoc-cis-2-amino-1-cyclopentanecarboxylic acid,
Fmoc-1-amino-1-cyclopropanecarboxylic acid,
Fmoc-D-2-amino-4-(ethylthio)butyric acid,
Fmoc-L-2-amino-4-(ethylthio)butyric acid, Fmoc-L-buthionine,
Fmoc-S-methyl-L-Cysteine, Fmoc-2-aminobenzoic acid (anthranilic
acid), Fmoc-3-aminobenzoic acid, Fmoc-4-aminobenzoic acid,
Fmoc-2-aminobenzophenone-2'-carboxylic acid,
Fmoc-N-(4-aminobenzoyl)-β-alanine,
Fmoc-2-amino-4,5-dimethoxybenzoic acid, Fmoc-4-aminohippuric acid,
Fmoc-2-amino-3-hydroxybenzoic acid,
Fmoc-2-amino-5-hydroxybenzoic acid,
Fmoc-3-amino-4-hydroxybenzoic acid,
Fmoc-4-amino-3-hydroxybenzoic acid,
Fmoc-4-amino-2-hydroxybenzoic acid,
Fmoc-5-amino-2-hydroxybenzoic acid,
Fmoc-2-amino-3-methoxybenzoic acid,
Fmoc-4-amino-3-methoxybenzoic acid,
Fmoc-2-amino-3-methylbenzoic acid,
Fmoc-2-amino-5-methylbenzoic acid,
Fmoc-2-amino-6-methylbenzoic acid,
Fmoc-3-amino-2-methylbenzoic acid,
Fmoc-3-amino-4-methylbenzoic acid,
Fmoc-4-amino-3-methylbenzoic acid, Fmoc-3-amino-2-naphthoic acid,
Fmoc-D,L-3-amino-3-phenylpropionic acid, Fmoc-L-Methyldopa,
Fmoc-2-amino-4,6-dimethyl-3-pyridinecarboxylic acid,
Fmoc-D,L-α-amino-2-thiophenacetic acid,
Fmoc-4-(carboxymethyl)piperazine, Fmoc-4-carboxypiperazine,
Fmoc-4-(carboxymethyl)homopiperazine,
Fmoc-4-phenyl-4-piperidinecarboxylic acid,
Fmoc-L-1,2,3,4-tetrahydronorharman-3-carboxylic acid, and
Fmoc-L-thiazolidine-4-carboxylic acid, all available from The
Peptide Laboratory (Richmond, CA, USA).

Non-natural residues can also be added
biosynthetically by engineering a suppressor tRNA, typically
one that recognizes the UAG stop codon, by chemical
aminoacylation with the desired unnatural amino acid.
Conventional site-directed mutagenesis is used to introduce the
chosen stop codon UAG at the site of interest in the protein
gene. When the acylated suppressor tRNA and the mutant gene
are combined in an in vitro transcription/translation system,
the unnatural amino acid is incorporated in response to the UAG
codon to give a protein containing that amino acid at the
specified position. Liu et al., Proc. Natl. Acad. Sci. USA
96(9):4780-5 (1999); Wang et al., Science 292(5516):498-500
(2001).
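
The sequence-level effect of the mutagenesis step described above can be sketched as follows, purely for illustration. The sketch assumes an in-frame DNA coding sequence written 5' to 3' and a 1-based codon number; the function name introduce_amber_codon is hypothetical. It models only the designed change to the coding sequence, not primer design or the mutagenesis procedure itself.

```python
def introduce_amber_codon(coding_sequence: str, codon_number: int) -> str:
    """Return the coding sequence with one codon (1-based) replaced by TAG.

    Hypothetical helper: models the designed mutant sequence only.
    """
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(seq) // 3
    if not 1 <= codon_number <= n_codons:
        raise ValueError("codon_number is outside the coding sequence")
    start = 3 * (codon_number - 1)
    # TAG in the DNA coding strand is transcribed to the UAG stop codon.
    return seq[:start] + "TAG" + seq[start + 3:]
```

For example, replacing the third codon of the four-codon sequence ATG GCT AAA GGT gives ATG GCT TAG GGT, so that the suppressor tRNA would insert the unnatural amino acid at position 3 of the translated product.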

The isolated proteins, protein fragments and fusion 
proteins of the present invention can also include nonnative 
inter-residue bonds, including bonds that lead to circular and 
branched forms.

The isolated proteins and protein fragments of the 
present invention can also include post-translational and post- 
synthetic modifications, either throughout the length of the 
protein or localized to one or more portions thereof.

For example, when produced by recombinant expression 
in eukaryotic cells, the isolated proteins, fragments, and 
fusion proteins of the present invention will typically include 
N-linked and/or O-linked glycosylation, the pattern of which
will reflect both the availability of glycosylation sites on 
the protein sequence and the identity of the host cell. 
Further modification of glycosylation pattern can be performed 
enzymatically.

As another example, recombinant polypeptides of the 
invention may also include an initial modified methionine 
residue, in some cases resulting from host-mediated processes. 

When the proteins, protein fragments, and protein 
fusions of the present invention are produced by chemical 
synthesis, post-synthetic modification can be performed before
deprotection and cleavage from the resin or after deprotection 
and cleavage. Modification before deprotection and cleavage of 
the synthesized protein often allows greater control, e.g. by 
allowing targeting of the modifying moiety to the N-terminus of 
a resin-bound synthetic peptide. 

Useful post-synthetic (and post-translational) 
modifications include conjugation to detectable labels, such as
fluorophores.

A wide variety of amine-reactive and thiol-reactive
fluorophore derivatives have been synthesized that react under 
nondenaturing conditions with N-terminal amino groups and 
epsilon amino groups of lysine residues, on the one hand, and 
with free thiol groups of cysteine residues, on the other. 

Kits are available commercially that permit 
conjugation of proteins to a variety of amine-reactive or 
thiol-reactive fluorophores: Molecular Probes, Inc. (Eugene,
OR, USA), e.g., offers kits for conjugating proteins to Alexa
Fluor 350, Alexa Fluor 430, Fluorescein-EX, Alexa Fluor 488,
Oregon Green 488, Alexa Fluor 532, Alexa Fluor 546, Alexa Fluor
568, Alexa Fluor 594, and Texas Red-X.

A wide variety of other amine-reactive and thiol- 
reactive fluorophores are available commercially (Molecular 
Probes, Inc., Eugene, OR, USA), including Alexa Fluor® 350, 
Alexa Fluor® 488, Alexa Fluor® 532, Alexa Fluor® 546, Alexa 
Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647 (monoclonal 
antibody labeling kits available from Molecular Probes, Inc., 
Eugene, OR, USA), BODIPY dyes, such as BODIPY 493/503, BODIPY 
FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 558/568, 
BODIPY 564/570, BODIPY 576/589, BODIPY 581/591,
BODIPY TR, BODIPY 630/650, BODIPY 650/665, Cascade Blue,
Cascade Yellow, Dansyl, lissamine rhodamine B, Marina Blue,
Oregon Green 488, Oregon Green 514, Pacific Blue, rhodamine 6G,
rhodamine green, rhodamine red, tetramethylrhodamine, and Texas Red
(available from Molecular Probes, Inc., Eugene, OR, USA). 

The polypeptides of the present invention can also be 
conjugated to fluorophores, other proteins, and other 
macromolecules, using bifunctional linking reagents. 

Common homobifunctional reagents include, e.g., APG,
AEDP, BASED, BMB, BMDB, BMH, BMOE, BM[PEO]3, BM[PEO]4, BS3,
BSOCOES, DFDNB, DMA, DMP, DMS, DPDPB, DSG, DSP (Lomant's
Reagent), DSS, DST, DTBP, DTME, DTSSP, EGS, HBVS,
Sulfo-BSOCOES, Sulfo-DST, Sulfo-EGS (all available from Pierce,
Rockford, IL, USA); common heterobifunctional cross-linkers
include ABH, AMAS, ANB-NOS, APDP, ASBA, BMPA, BMPH, BMPS, EDC,
EMCA, EMCH, EMCS, KMUA, KMUH, GMBS, LC-SMCC, LC-SPDP, MBS,
M2C2H, MPBH, MSA, NHS-ASA, PDPH, PMPI, SADP, SAED, SAND,
SANPAH, SASD, SATP, SBAP, SFAD, SIA, SIAB, SMCC, SMPB, SMPH,
SMPT, SPDP, Sulfo-EMCS, Sulfo-GMBS, Sulfo-HSAB, Sulfo-KMUS,
Sulfo-LC-SPDP, Sulfo-MBS, Sulfo-NHS-LC-ASA, Sulfo-SADP,
Sulfo-SANPAH, Sulfo-SIAB, Sulfo-SMCC, Sulfo-SMPB,
Sulfo-LC-SMPT, SVSB, TFCS (all available from Pierce, Rockford,
IL, USA).

The proteins, protein fragments, and protein fusions 
of the present invention can be conjugated, using such cross- 
linking reagents, to fluorophores that are not amine- or
thiol-reactive.

Other labels that usefully can be conjugated to the 
proteins, protein fragments, and fusion proteins of the present 
invention include radioactive labels, echosonographic contrast 
reagents, and MRI contrast agents.

The proteins, protein fragments, and protein fusions 
of the present invention can also usefully be conjugated using 
cross-linking agents to carrier proteins, such as KLH, bovine
thyroglobulin, and even bovine serum albumin (BSA), to increase
immunogenicity for raising anti-HTPL antibodies. 

The proteins, protein fragments, and protein fusions 
of the present invention can also usefully be conjugated to 
polyethylene glycol (PEG); PEGylation increases the serum
half-life of proteins administered intravenously for replacement
therapy. Delgado et al., Crit. Rev. Ther. Drug Carrier Syst.
9(3-4):249-304 (1992); Scott et al., Curr. Pharm. Des.
4(6):423-38 (1998); DeSantis et al., Curr. Opin. Biotechnol.
10(4):324-30 (1999), incorporated herein by reference in their
entireties. PEG monomers can be attached to the protein
directly or through a linker, with PEGylation using PEG
monomers activated with tresyl chloride
(2,2,2-trifluoroethanesulphonyl chloride) permitting direct
attachment under mild conditions.

The isolated proteins of the present invention, 
including fusions thereof, can be produced by recombinant 
expression, typically using the expression vectors of the 
present invention as above-described or, if fewer than about 
100 amino acids, by chemical synthesis (typically, solid phase 
synthesis), and, on occasion, by in vitro translation. 

Production of the isolated proteins of the present 
invention can optionally be followed by purification. 
Purification of recombinantly expressed proteins is now well 
within the skill in the art. See, e.g., Thorner et al. (eds.),
Applications of Chimeric Genes and Hybrid Proteins, Part A:
Gene Expression and Protein Purification (Methods in
Enzymology, Volume 326), Academic Press (2000), (ISBN:
0121822273); Hardin (ed.), Cloning, Gene Expression and Protein
Purification: Experimental Procedures and Process Rationale,
Oxford Univ. Press (2001) (ISBN: 0195132947); Marshak et al.,
Strategies for Protein Purification and Characterization: A
Laboratory Course Manual, Cold Spring Harbor Laboratory Press
(1996) (ISBN: 0-87969-385-1); and Roe (ed.), Protein
Purification Applications, Oxford University Press (2001), the
disclosures of which are incorporated herein by reference in 
their entireties, and thus need not be detailed here. 

Briefly, however, if purification tags have been 
fused through use of an expression vector that appends such a 
tag, purification can be effected, at least in part, by means 
appropriate to the tag, such as use of immobilized metal 
affinity chromatography for polyhistidine tags. Other 
techniques common in the art include ammonium sulfate 
fractionation, immunoprecipitation, fast protein liquid 
chromatography (FPLC), high performance liquid chromatography 
(HPLC), and preparative gel electrophoresis. 

Purification of chemically-synthesized peptides can 
readily be effected, e.g., by HPLC. 

Accordingly, it is an aspect of the present invention 
to provide the isolated proteins of the present invention in 
pure or substantially pure form. 

A purified protein of the present invention is an 
isolated protein, as above described, that is present at a 
concentration of at least 95%, as measured on a weight basis 
(w/w) with respect to total protein in a composition. Such 
purities can often be obtained during chemical synthesis 
without further purification, as, e.g., by HPLC. Purified 
proteins of the present invention can be present at a 
concentration (measured on a weight basis with respect to total 
protein in a composition) of 96%, 97%, 98%, and even 99%. The 
proteins of the present invention can even be present at levels 
of 99.5%, 99.6%, and even 99.7%, 99.8%, or even 99.9% following 
purification, as by HPLC. 

Although high levels of purity are particularly 
useful when the isolated proteins of the present invention are 
used as therapeutic agents - such as vaccines, or for 
replacement therapy - the isolated proteins of the present 
invention are also useful at lower purity. For example, 
partially purified proteins of the present invention can be 
used as immunogens to raise antibodies in laboratory animals. 

Thus, in another aspect, the present invention 
provides the isolated proteins of the present invention in 
substantially purified form. A "substantially purified 
protein" of the present invention is an isolated protein, as 
above described, present at a concentration of at least 70%, 
measured on a weight basis with respect to total protein in a 
composition. Usefully, the substantially purified protein is 
present at a concentration, measured on a weight basis with 
respect to total protein in a composition, of at least 75%, 
80%, or even at least 85%, 90%, 91%, 92%, 93%, 94%, 94.5% or 
even at least 94.9%. 
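
By way of illustration only, the weight-basis purity thresholds 
recited above (at least 95% for a purified protein, at least 70% 
for a substantially purified protein) can be sketched in Python as 
follows. The function names and the mass inputs are hypothetical 
and are not part of the disclosure; the sketch assumes only that 
both masses are measured in the same unit. 

```python
def purity_percent(target_mass, total_protein_mass):
    """Return purity on a weight basis (w/w) with respect to total protein, in percent."""
    if total_protein_mass <= 0:
        raise ValueError("total protein mass must be positive")
    if not 0 <= target_mass <= total_protein_mass:
        raise ValueError("target mass must lie between 0 and the total protein mass")
    return 100.0 * target_mass / total_protein_mass


def purity_class(percent):
    """Map a w/w purity percentage onto the two categories defined in the text."""
    if percent >= 95.0:
        return "purified"                # at least 95% w/w
    if percent >= 70.0:
        return "substantially purified"  # at least 70% but below 95% w/w
    return "below substantially purified"
```

Both masses would in practice come from a protein assay of the 
composition; the choice of assay is left open here. 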

In preferred embodiments, the purified and 
substantially purified proteins of the present invention are in 
compositions that lack detectable ampholytes, acrylamide 
monomers, bis-acrylamide monomers, and polyacrylamide. 

The proteins, fragments, and fusions of the present 
invention can usefully be attached to a substrate. The 
substrate can be porous or solid, planar or non-planar; the bond 
can be covalent or noncovalent. 

For example, the proteins, fragments, and fusions of 
the present invention can usefully be bound to a porous 
substrate, commonly a membrane, typically comprising 
nitrocellulose, polyvinylidene fluoride (PVDF), or cationically 
derivatized, hydrophilic PVDF; so bound, the proteins, 
fragments, and fusions of the present invention can be used to 
detect and quantify antibodies, e.g. in serum, that bind 
specifically to the immobilized protein of the present 
invention. 

As another example, the proteins, fragments, and 
fusions of the present invention can usefully be bound to a 
substantially nonporous substrate, such as plastic, to detect 
and quantify antibodies, e.g. in serum, that bind specifically 
to the immobilized protein of the present invention. Such 
plastics include polymethylacrylic, polyethylene, 
polypropylene, polyacrylate, polymethylmethacrylate, 
polyvinylchloride, polytetrafluoroethylene, polystyrene, 
polycarbonate, polyacetal, polysulfone, celluloseacetate, 
cellulosenitrate, nitrocellulose, or mixtures thereof; when the 
assay is performed in a standard microtiter dish, the plastic is 
typically polystyrene. 



The proteins, fragments, and fusions of the present 
invention can also be attached to a substrate suitable for use 
as a surface enhanced laser desorption ionization source; so 
attached, the protein, fragment, or fusion of the present 
invention is useful for binding and then detecting secondary 
proteins that bind with sufficient affinity or avidity to the 
surface-bound protein to indicate biologic interaction 
therebetween. The proteins, fragments, and fusions of the 
present invention can also be attached to a substrate suitable 
for use in surface plasmon resonance detection; so attached, 
the protein, fragment, or fusion of the present invention is 
useful for binding and then detecting secondary proteins that 
bind with sufficient affinity or avidity to the surface-bound 
protein to indicate biological interaction therebetween. 

HTPL Proteins 

In a first series of protein embodiments, the 
invention provides an isolated HTPL-L polypeptide having the 
amino acid sequence in SEQ ID NO: 3, which is a full length 
HTPL-L protein. When used as immunogens, the full length 
proteins of the present invention can be used, inter alia, to 
elicit antibodies that bind to a variety of epitopes of the 
HTPL-L protein. 

The invention further provides fragments of the 
above-described polypeptides, particularly fragments having at 
least 6 amino acids, typically at least 8 amino acids, often at 
least 15 amino acids, and even the entirety of the sequence 
given in SEQ ID NO: 3. 
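
As a non-limiting illustration, the set of contiguous fragments 
of a given minimum length (6, 8, or 15 amino acids in the passage 
above), up to and including the full sequence, can be enumerated 
with a simple sliding window. The sketch below is in Python; the 
function name is hypothetical, and any sequence passed to it in an 
example is a placeholder string, not SEQ ID NO: 3. 

```python
def contiguous_fragments(sequence, min_length=6):
    """Yield every contiguous fragment of `sequence` with at least `min_length` residues.

    Fragments are produced shortest first; the final fragment is the full sequence.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    n = len(sequence)
    for length in range(min_length, n + 1):
        # Slide a window of the current length across the sequence.
        for start in range(n - length + 1):
            yield sequence[start:start + length]
```

Passing min_length=8 or min_length=15 gives the narrower fragment 
sets mentioned in the text; a sequence shorter than min_length 
yields no fragments. 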

In another series of protein embodiments, the 
invention provides an isolated HTPL-S polypeptide having the 
amino acid sequence in SEQ ID NO: 6, which is a full length 
HTPL-S protein. When used as immunogens, the full length 
proteins of the present invention can be used, inter alia, to 
elicit antibodies that bind to a variety of epitopes of the 
HTPL-S protein. 

The invention further provides fragments of the 
above-described polypeptides, particularly fragments having at 
least 6 amino acids, typically at least 8 amino acids, often at 
least 15 amino acids, and even the entirety of the sequence 
given in SEQ ID NO: 6. 

The invention further provides fragments of at least 
6 amino acids, typically at least 8 amino acids, often at least 
15 amino acids, and even the entirety of the sequence given in 
SEQ ID NO: 10, SEQ ID NO: 14, SEQ ID NO: 4799. 

As described above, the invention further provides 
proteins that differ in sequence from those described with 
particularity in the above-referenced SEQ ID NOs., whether by 
way of insertion or deletion, by way of conservative or 
moderately conservative substitutions, as hybridization related 
proteins, or as cross-hybridizing proteins, with those that 
substantially retain a HTPL activity particularly useful. 

The invention further provides fusions of the 
proteins and protein fragments herein described to heterologous 
polypeptides. 

ANTIBODIES AND ANTIBODY-PRODUCING CELLS 



In another aspect, the invention provides antibodies, 
including fragments and derivatives thereof, that bind 
specifically to HTPL proteins and protein fragments of the 
present invention or to one or more of the proteins and protein 
fragments encoded by the isolated HTPL nucleic acids of the 
present invention. The antibodies of the present invention can 
be specific for all of linear epitopes, discontinuous epitopes, 
or conformational epitopes of such proteins or protein 
fragments, either as present on the protein in its native 
conformation or, in some cases, as present on the proteins as 
denatured, as, e.g., by solubilization in SDS. 

In other embodiments, the invention provides 
antibodies, including fragments and derivatives thereof, the 
binding of which can be competitively inhibited by one or more 
of the HTPL proteins and protein fragments of the present 
invention, or by one or more of the proteins and protein 
fragments encoded by the isolated HTPL nucleic acids of the 
present invention. 



As used herein, the term "antibody" refers to a 
polypeptide, at least a portion of which is encoded by at least 
one immunoglobulin gene, which can bind specifically to a first 
molecular species, and to fragments or derivatives thereof that 
remain capable of such specific binding. 

By "bind specifically" and "specific binding" is here 
intended the ability of the antibody to bind to a first 
molecular species in preference to binding to other molecular 
species with which the antibody and first molecular species are 
admixed. An antibody is said specifically to "recognize" a 
first molecular species when it can bind specifically to that 
first molecular species. 

As is well known in the art, the degree to which an 
antibody can discriminate as among molecular species in a 
mixture will depend, in part, upon the conformational 
relatedness of the species in the mixture; typically, the 
antibodies of the present invention will discriminate over 
adventitious binding to non-HTPL proteins by at least two-fold, 
more typically by at least 5-fold, typically by more than 
10-fold, 25-fold, 50-fold, 75-fold, and often by more than 
100-fold, and on occasion by more than 500-fold or 1000-fold. 
When used to detect the proteins or protein fragments of the 
present invention, the antibody of the present invention is 
sufficiently specific when it can be used to determine the 
presence of the protein of the present invention in samples 
derived from human testis, as well as adrenal, adult and fetal 
liver, bone marrow, brain, kidney, lung, placenta, prostate, 
skeletal muscle and colon. 
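
Purely as an illustrative sketch, fold discrimination can be 
expressed as the ratio of a binding signal obtained with an HTPL 
protein to the signal obtained with a non-HTPL protein under the 
same conditions. The text does not specify how discrimination is 
measured; the signal inputs, the function names, and the default 
two-fold cutoff in the Python sketch below are assumptions made 
for illustration. 

```python
def fold_discrimination(target_signal, background_signal):
    """Ratio of binding signal on the target to signal on a non-target protein.

    Both signals are assumed to be background-corrected readouts
    (hypothetical inputs) measured in the same units.
    """
    if background_signal <= 0:
        raise ValueError("background signal must be positive")
    return target_signal / background_signal


def meets_minimum_fold(target_signal, background_signal, min_fold=2.0):
    """True when discrimination reaches the chosen fold threshold (e.g. 2, 5, 10, 100)."""
    return fold_discrimination(target_signal, background_signal) >= min_fold
```

The min_fold argument can be set to any of the levels recited 
above, from two-fold up to 1000-fold. 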

Typically, the affinity or avidity of an antibody (or 
antibody multimer, as in the case of an IgM pentamer) of the 
present invention for a protein or protein fragment of the 
present invention will be at least about 1 x 10^-6 molar (M), 
typically at least about 5 x 10^-7 M, usefully at least about 1 
x 10^-7 M, with affinities and avidities of at least 1 x 10^-8 M, 
5 x 10^-9 M, and 1 x 10^-10 M proving especially useful. 

The antibodies of the present invention can be 
naturally-occurring forms, such as IgG, IgM, IgD, IgE, and IgA, 
from any mammalian species. 

Human antibodies can, but will infrequently, be drawn 
directly from human donors or human cells. In such case, 
antibodies to the proteins of the present invention will 
typically have resulted from fortuitous immunization, such as 
autoimmune immunization, with the protein or protein fragments 
of the present invention. Such antibodies will typically, but 
will not invariably, be polyclonal. 

Human antibodies are more frequently obtained using 
transgenic animals that express human immunoglobulin genes, 
which transgenic animals can be affirmatively immunized with 
the protein immunogen of the present invention. Human Ig- 
transgenic mice capable of producing human antibodies and 
methods of producing human antibodies therefrom upon specific 
immunization are described, inter alia, in U.S. Patent Nos. 
6,162,963; 6,150,584; 6,114,598; 6,075,181; 5,939,598; 
5,877,397; 5,874,299; 5,814,318; 5,789,650; 5,770,429; 
5,661,016; 5,633,425; 5,625,126; 5,569,825; 5,545,807; 
5,545,806, and 5,591,669, the disclosures of which are 
incorporated herein by reference in their entireties. Such 
antibodies are typically monoclonal, and are typically produced 
using techniques developed for production of murine antibodies. 

Human antibodies are particularly useful, and often 
preferred, when the antibodies of the present invention are to 
be administered to human beings as in vivo diagnostic or 
therapeutic agents, since recipient immune response to the 
administered antibody will often be substantially less than 
that occasioned by administration of an antibody derived from 
another species, such as mouse. 

IgG, IgM, IgD, IgE and IgA antibodies of the present 
invention are also usefully obtained from other mammalian 
species, including rodents (typically mouse, but also rat, 
guinea pig, and hamster), lagomorphs (typically rabbits), and 
also larger mammals, such as sheep, goats, cows, and horses. 
In such cases, as with the transgenic human-antibody-producing 
non-human mammals, fortuitous immunization is not required, and 
the non-human mammal is typically affirmatively immunized, 
according to standard immunization protocols, with the protein 
or protein fragment of the present invention. 

As discussed above, virtually all fragments of 8 or 
more contiguous amino acids of the proteins of the present 
invention can be used effectively as immunogens when conjugated 
to a carrier, typically a protein such as bovine thyroglobulin, 
keyhole limpet hemocyanin, or bovine serum albumin, 
conveniently using a bifunctional linker such as those 
described elsewhere above, which discussion is incorporated by 
reference here. 

Immunogenicity can also be conferred by fusion of the 
proteins and protein fragments of the present invention to 
other moieties. 

For example, peptides of the present invention can be 
produced by solid phase synthesis on a branched polylysine core 
matrix; these multiple antigenic peptides (MAPs) provide high 
purity, increased avidity, accurate chemical definition and 
improved safety in vaccine development. Tam et al., Proc. 
Natl. Acad. Sci. USA 85:5409-5413 (1988); Posnett et al., J. 
Biol. Chem. 263:1719-1725 (1988). 

Protocols for immunizing non-human mammals are well- 
established in the art, Harlow et al. (eds.), Antibodies: A 
Laboratory Manual, Cold Spring Harbor Laboratory (1998) (ISBN: 
0879693142); Coligan et al. (eds.), Current Protocols in 
Immunology, John Wiley & Sons, Inc. (2001) (ISBN: 
0-471-52276-7); Zola, Monoclonal Antibodies: Preparation and 
Use of Monoclonal Antibodies and Engineered Antibody 
Derivatives (Basics: From Background to Bench), Springer Verlag 
(2000) (ISBN: 0387915907), the disclosures of which are 
incorporated herein by reference, and often include multiple 
immunizations, either with or without adjuvants such as 
Freund's complete adjuvant and Freund's incomplete adjuvant. 

Antibodies from nonhuman mammals can be polyclonal or 
monoclonal, with polyclonal antibodies having certain 
advantages in immunohistochemical detection of the proteins of 
the present invention and monoclonal antibodies having 
advantages in identifying and distinguishing particular 
epitopes of the proteins of the present invention. 

Following immunization, the antibodies of the present 
invention can be produced using any art-accepted technique. 
Such techniques are well known in the art, Coligan et al. 
(eds.), Current Protocols in Immunology, John Wiley & Sons, 
Inc. (2001) (ISBN: 0-471-52276-7); Zola, Monoclonal Antibodies: 
Preparation and Use of Monoclonal Antibodies and Engineered 
Antibody Derivatives (Basics: From Background to Bench), 
Springer Verlag (2000) (ISBN: 0387915907); Howard et al. 
(eds.), Basic Methods in Antibody Production and 
Characterization, CRC Press (2000) (ISBN: 0849394457); Harlow 
et al. (eds.), Antibodies: A Laboratory Manual, Cold Spring 
Harbor Laboratory (1998) (ISBN: 0879693142); Davis (ed.), 
Monoclonal Antibody Protocols, Vol. 45, Humana Press (1995) 
(ISBN: 0896033082); Delves (ed.), Antibody Production: 
Essential Techniques, John Wiley & Son Ltd (1997) (ISBN: 
0471970107); Kenney, Antibody Solution: An Antibody Methods 
Manual, Chapman & Hall (1997) (ISBN: 0412141914), incorporated 
herein by reference in their entireties, and thus need not be 
detailed here. 

Briefly, however, such techniques include, inter 
alia, production of monoclonal antibodies by hybridomas and 
expression of antibodies or fragments or derivatives thereof 
from host cells engineered to express immunoglobulin genes or 
fragments thereof. These two methods of production are not 
mutually exclusive: genes encoding antibodies specific for the 
proteins or protein fragments of the present invention can be 
cloned from hybridomas and thereafter expressed in other host 
cells. Nor need the two necessarily be performed together: 
e.g., genes encoding antibodies specific for the proteins and 
protein fragments of the present invention can be cloned 
directly from B cells known to be specific for the desired 
protein, as further described in U.S. Pat. No. 5,627,052, the 
disclosure of which is incorporated herein by reference in its 
entirety, or from antibody-displaying phage. 

Recombinant expression in host cells is particularly 
useful when fragments or derivatives of the antibodies of the 
present invention are desired. 

Host cells for recombinant antibody production - 
either whole antibodies, antibody fragments, or antibody 
derivatives - can be prokaryotic or eukaryotic. 

Prokaryotic hosts are particularly useful for 
producing phage displayed antibodies of the present invention. 

The technology of phage-displayed antibodies, in 
which antibody variable region fragments are fused, for 
example, to the gene III protein (pIII) or gene VIII protein 
(pVIII) for display on the surface of filamentous phage, such 
as M13, is by now well-established, Sidhu, Curr. Opin. 
Biotechnol. 11(6):610-6 (2000); Griffiths et al., Curr. Opin. 
Biotechnol. 9(1):102-8 (1998); Hoogenboom et al., 
Immunotechnology 4(1):1-20 (1998); Rader et al., Current 
Opinion in Biotechnology 8:503-508 (1997); Aujame et al., Human 
Antibodies 8:155-168 (1997); Hoogenboom, Trends in Biotechnol. 
15:62-70 (1997); de Kruif et al., 17:453-455 (1996); Barbas et 
al., Trends in Biotechnol. 14:230-234 (1996); Winter et al., 
Ann. Rev. Immunol. 433-455 (1994), and techniques and protocols 
required to generate, propagate, screen (pan), and use the 
antibody fragments from such libraries have recently been 
compiled, Barbas et al., Phage Display: A Laboratory Manual, 
Cold Spring Harbor Laboratory Press (2001) (ISBN 
0-87969-546-3); Kay et al. (eds.), Phage Display of Peptides 
and Proteins: A Laboratory Manual, Academic Press, Inc. (1996); 
Abelson et al. (eds.), Combinatorial Chemistry, Methods in 
Enzymology vol. 267, Academic Press (May 1996), the disclosures 
of which are incorporated herein by reference in their 
entireties. 

Typically, phage-displayed antibody fragments are 
scFv fragments or Fab fragments; when desired, full length 
antibodies can be produced by cloning the variable regions from 
the displaying phage into a complete antibody and expressing 
the full length antibody in a further prokaryotic or a 
eukaryotic host cell. 

Eukaryotic cells are also useful for expression of 
the antibodies, antibody fragments, and antibody derivatives of 
the present invention. 

For example, antibody fragments of the present 
invention can be produced in Pichia pastoris, Takahashi et al., 
Biosci. Biotechnol. Biochem. 64(10):2138-44 (2000); Freyre et 
al., J. Biotechnol. 76(2-3):157-63 (2000); Fischer et al., 
Biotechnol. Appl. Biochem. 30(Pt 2):117-20 (1999); Pennell et 
al., Res. Immunol. 149(6):599-603 (1998); Eldin et al., J. 
Immunol. Methods. 201(1):67-75 (1997); and in Saccharomyces 
cerevisiae, Frenken et al., Res. Immunol. 149(6):589-99 (1998); 
Shusta et al., Nature Biotechnol. 16(8):773-7 (1998), the 
disclosures of which are incorporated herein by reference in 
their entireties. 

Antibodies, including antibody fragments and 
derivatives, of the present invention can also be produced in 
insect cells, Li et al., Protein Expr. Purif. 21(1):121-8 
(2001); Ailor et al., Biotechnol. Bioeng. 58(2-3):196-203 
(1998); Hsu et al., Biotechnol. Prog. 13(1):96-104 (1997); 
Edelman et al., Immunology 91(1):13-9 (1997); and Nesbit et 
al., J. Immunol. Methods. 151(1-2):201-8 (1992), the 
disclosures of which are incorporated herein by reference in 
their entireties. 

Antibodies and fragments and derivatives thereof of 
the present invention can also be produced in plant cells, 
Giddings et al., Nature Biotechnol. 18(11):1151-5 (2000); 
Gavilondo et al., Biotechniques 29(1):128-38 (2000); Fischer et 
al., J. Biol. Regul. Homeost. Agents 14(2):83-92 (2000); 
Fischer et al., Biotechnol. Appl. Biochem. 30(Pt 2):113-6 
(1999); Fischer et al., Biol. Chem. 380(7-8):825-39 (1999); 
Russell, Curr. Top. Microbiol. Immunol. 240:119-38 (1999); and 
Ma et al., Plant Physiol. 109(2):341-6 (1995), the disclosures 
of which are incorporated herein by reference in their 
entireties. 

Mammalian cells useful for recombinant expression of 
antibodies, antibody fragments, and antibody derivatives of the 
present invention include CHO cells, COS cells, 293 cells, and 
myeloma cells. 

Verma et al., J. Immunol. Methods 216(1-2):165-81 
(1998), review and compare bacterial, yeast, insect and 
mammalian expression systems for expression of antibodies. 

Antibodies of the present invention can also be 
prepared by cell free translation, as further described in Merk 
et al., J. Biochem. (Tokyo) 125(2):328-33 (1999) and Ryabova 
et al., Nature Biotechnol. 15(1):79-84 (1997), and in the milk 
of transgenic animals, as further described in Pollock et al., 
J. Immunol. Methods 231(1-2):147-57 (1999), the disclosures of 
which are incorporated herein by reference in their entireties. 

The invention further provides antibody fragments 
that bind specifically to one or more of the proteins and 
protein fragments of the present invention, to one or more of 
the proteins and protein fragments encoded by the isolated 
nucleic acids of the present invention, or the binding of which 
can be competitively inhibited by one or more of the proteins 
and protein fragments of the present invention or one or more 
of the proteins and protein fragments encoded by the isolated 
nucleic acids of the present invention. 

Among such useful fragments are Fab, Fab', Fv, 
F(ab')2, and single chain Fv (scFv) fragments. Other useful 
fragments are described in Hudson, Curr. Opin. Biotechnol. 
9(4):395-402 (1998). 

It is also an aspect of the present invention to 
provide antibody derivatives that bind specifically to one or 
more of the proteins and protein fragments of the present 
invention, to one or more of the proteins and protein fragments 
encoded by the isolated nucleic acids of the present invention, 
or the binding of which can be competitively inhibited by one 
or more of the proteins and protein fragments of the present 
invention or one or more of the proteins and protein fragments 
encoded by the isolated nucleic acids of the present invention. 

Among such useful derivatives are chimeric, 
primatized, and humanized antibodies; such derivatives are less 
immunogenic in human beings, and thus more suitable for in vivo 
administration, than are unmodified antibodies from non-human 
mammalian species. 

Chimeric antibodies typically include heavy and/or 
light chain variable regions (including both CDR and framework 
residues) of immunoglobulins of one species, typically mouse, 
fused to constant regions of another species, typically human. 
See, e.g., U.S. Pat. No. 5,807,715; Morrison et al., Proc. 
Natl. Acad. Sci. USA 81(21):6851-5 (1984); Sharon et al., Nature 
309(5966):364-7 (1984); Takeda et al., Nature 314(6010):452-4 
(1985), the disclosures of which are incorporated herein by 
reference in their entireties. Primatized and humanized 
antibodies typically include heavy and/or light chain CDRs from 
a murine antibody grafted into a non-human primate or human 
antibody V region framework, usually further comprising a human 
constant region, Riechmann et al., Nature 332(6162):323-7 
(1988); Co et al., Nature 351(6326):501-2 (1991); U.S. Pat. 
Nos. 6,054,297; 5,821,337; 5,770,196; 5,766,886; 5,821,123; 
5,869,619; 6,180,377; 6,013,256; 5,693,761; and 6,180,370, the 
disclosures of which are incorporated herein by reference in 
their entireties. 

Other useful antibody derivatives of the invention 
include heteromeric antibody complexes and antibody fusions, 
such as diabodies (bispecific antibodies), single-chain 
diabodies, and intrabodies. 

The antibodies of the present invention, including 
fragments and derivatives thereof, can usefully be labeled. It 
is, therefore, another aspect of the present invention to 
provide labeled antibodies that bind specifically to one or 
more of the proteins and protein fragments of the present 
invention, to one or more of the proteins and protein fragments 
encoded by the isolated nucleic acids of the present invention, 
or the binding of which can be competitively inhibited by one 
or more of the proteins and protein fragments of the present 
invention or one or more of the proteins and protein fragments 
encoded by the isolated nucleic acids of the present invention. 

The choice of label depends, in part, upon the 
desired use. 

For example, when the antibodies of the present 
invention are used for immunohistochemical staining of tissue 
samples, the label can usefully be an enzyme that catalyzes 
production and local deposition of a detectable product. 

Enzymes typically conjugated to antibodies to permit 
their immunohistochemical visualization are well known, and 
include alkaline phosphatase, β-galactosidase, glucose oxidase, 
horseradish peroxidase (HRP), and urease. Typical substrates 
for production and deposition of visually detectable products 
include o-nitrophenyl-beta-D-galactopyranoside (ONPG); 
o-phenylenediamine dihydrochloride (OPD); p-nitrophenyl 
phosphate (PNPP); p-nitrophenyl-beta-D-galactopyranoside 
(PNPG); 3,3'-diaminobenzidine (DAB); 3-amino-9-ethylcarbazole 
(AEC); 4-chloro-1-naphthol (CN); 
5-bromo-4-chloro-3-indolyl-phosphate (BCIP); ABTS®; BluoGal; 
iodonitrotetrazolium (INT); nitroblue tetrazolium chloride 
(NBT); phenazine methosulfate (PMS); phenolphthalein 
monophosphate (PMP); tetramethyl benzidine (TMB); 
tetranitroblue tetrazolium (TNBT); X-Gal; X-Gluc; and 
X-Glucoside. 

Other substrates can be used to produce products for 
local deposition that are luminescent. For example, in the 
presence of hydrogen peroxide (H2O2), horseradish peroxidase 
(HRP) can catalyze the oxidation of cyclic diacylhydrazides, 
such as luminol. Immediately following the oxidation, the 
luminol is in an excited state (intermediate reaction product), 
which decays to the ground state by emitting light. Strong 
enhancement of the light emission is produced by enhancers, 
such as phenolic compounds. Advantages include high 
sensitivity, high resolution, and rapid detection without 
radioactivity and requiring only small amounts of antibody. 
See, e.g., Thorpe et al., Methods Enzymol. 133:331-53 (1986); 
Kricka et al., J. Immunoassay 17(1):67-83 (1996); and Lundqvist 
et al., J. Biolumin. Chemilumin. 10(6):353-9 (1995), the 
disclosures of which are incorporated herein by reference in 
their entireties. Kits for such enhanced chemiluminescent 
detection (ECL) are available commercially. 

The antibodies can also be labeled using colloidal 
gold. 



As another example, when the antibodies of the 
present invention are used, e.g., for flow cytometric 
detection, for scanning laser cytometric detection, or for 
fluorescent immunoassay, they can usefully be labeled with 
fluorophores. 

There are a wide variety of fluorophore labels that 
can usefully be attached to the antibodies of the present 
invention. 

For flow cytometric applications, both for 
extracellular detection and for intracellular detection, common 
useful fluorophores can be fluorescein isothiocyanate (FITC), 
allophycocyanin (APC), R-phycoerythrin (PE), peridinin 
chlorophyll protein (PerCP), Texas Red, Cy3, Cy5, fluorescence 
resonance energy tandem fluorophores such as PerCP-Cy5.5, 
PE-Cy5, PE-Cy5.5, PE-Cy7, PE-Texas Red, and APC-Cy7. 

Other fluorophores include, inter alia, Alexa Fluor® 
350, Alexa Fluor® 488, Alexa Fluor® 532, Alexa Fluor® 546, 
Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647 
(monoclonal antibody labeling kits available from Molecular 
Probes, Inc., Eugene, OR, USA), BODIPY dyes, such as BODIPY 
493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, 
BODIPY 558/568, BODIPY 558/568, BODIPY 564/570, BODIPY 576/589, 
BODIPY 581/591, BODIPY TR, BODIPY 630/650, BODIPY 650/665, 
Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B, 
Marina Blue, Oregon Green 488, Oregon Green 514, Pacific Blue, 
rhodamine 6G, rhodamine green, rhodamine red, 
tetramethylrhodamine, Texas Red (available from Molecular 
Probes, Inc., Eugene, OR, USA), and Cy2, Cy3, Cy3.5, Cy5, 
Cy5.5, Cy7, all of which are also useful for fluorescently 
labeling the antibodies of the present invention. 

For secondary detection using labeled avidin, 
streptavidin, captavidin or neutravidin, the antibodies of the 
present invention can usefully be labeled with biotin. 

When the antibodies of the present invention are 
used, e.g., for western blotting applications, they can 
usefully be labeled with radioisotopes, such as 33P, 32P, 35S, 
3H, and 125I. 

As another example, when the antibodies of the 
present invention are used for radioimmunotherapy, the label 
can usefully be 228Th, 227Ac, 225Ac, 223Ra, 213Bi, 212Pb, 212Bi, 
211At, 203Pb, 194Os, 188Re, 186Re, 153Sm, 149Tb, 131I, 125I, 
111In, 105Rh, 99mTc, 97Ru, 90Y, 90Sr, 88Y, 72Se, 67Cu, or 47Sc. 

As another example, when the antibodies of the 
present invention are to be used for in vivo diagnostic use, 
they can be rendered detectable by conjugation to MRI contrast 
agents, such as gadolinium diethylenetriaminepentaacetic acid 
(DTPA), Lauffer et al., Radiology 207(2):529-38 (1998), or by 
radioisotopic labeling. 

As would be understood, use of the labels described 
above is not restricted to the application for which they 
were mentioned. 

The antibodies of the present invention, including 
fragments and derivatives thereof, can also be conjugated to 
toxins, in order to target the toxin's ablative action to cells 
that display and/or express the proteins of the present 
invention. Commonly, the antibody in such immunotoxins is 
conjugated to Pseudomonas exotoxin A, diphtheria toxin, shiga 
toxin A, anthrax toxin lethal factor, or ricin. See Hall 
(ed.), Immunotoxin Methods and Protocols (Methods in Molecular 
Biology, Vol 166), Humana Press (2000) (ISBN: 0896037754); and 
Frankel et al. (eds.), Clinical Applications of Immunotoxins 
(ISBN: 3540640975), the disclosures of which are incorporated 
herein by reference in their entireties, for review. 

The antibodies of the present invention can usefully
be attached to a substrate, and it is, therefore, another
aspect of the invention to provide antibodies that bind
specifically to one or more of the proteins and protein
fragments of the present invention, to one or more of the
proteins and protein fragments encoded by the isolated nucleic
acids of the present invention, or the binding of which can be
competitively inhibited by one or more of the proteins and
protein fragments of the present invention or one or more of
the proteins and protein fragments encoded by the isolated
nucleic acids of the present invention, attached to a
substrate.

Substrates can be porous or nonporous, planar or
nonplanar.

For example, the antibodies of the present invention 
can usefully be conjugated to filtration media, such as NHS-
activated Sepharose or CNBr-activated Sepharose for purposes of
immunoaffinity chromatography.

For example, the antibodies of the present invention
can usefully be attached to paramagnetic microspheres,
typically by biotin-streptavidin interaction, which microspheres
can then be used for isolation of cells that express or display
the proteins of the present invention. As another example, the
antibodies of the present invention can usefully be attached to
the surface of a microtiter plate for ELISA.

As noted above, the antibodies of the present
invention can be produced in prokaryotic and eukaryotic cells.
It is, therefore, another aspect of the present invention to
provide cells that express the antibodies of the present
invention, including hybridoma cells, B cells, plasma cells,
and host cells recombinantly modified to express the antibodies
of the present invention.

In yet a further aspect, the present invention 
provides aptamers evolved to bind specifically to one or more 
of the proteins and protein fragments of the present invention, 
to one or more of the proteins and protein fragments encoded by 
the isolated nucleic acids of the present invention, or the 
binding of which can be competitively inhibited by one or more 
of the proteins and protein fragments of the present invention 
or one or more of the proteins and protein fragments encoded by 
the isolated nucleic acids of the present invention. 

HTPL Antibodies 

In a first series of antibody embodiments, the 
invention provides antibodies, both polyclonal and monoclonal, 
and fragments and derivatives thereof, that bind specifically 
to a polypeptide having the amino acid sequence in SEQ ID NO: 
3, which are full length HTPL-L proteins. 

In a second series of antibody embodiments, the 
invention provides antibodies, both polyclonal and monoclonal, 
and fragments and derivatives thereof, that bind specifically 
to a polypeptide having the amino acid sequence in SEQ ID NO:
6, which are full length HTPL-S proteins. 

Such antibodies are useful in in vitro immunoassays, 
such as ELISA, western blot or immunohistochemical assay of 
testis or other disease tissue or cells. Such antibodies are 
also useful in isolating and purifying HTPL proteins, including 
related cross-reactive proteins, by immunoprecipitation, 
immunoaffinity chromatography, or magnetic bead-mediated
purification. 

In another series of antibody embodiments, the 
invention provides antibodies, both polyclonal and monoclonal, 
and fragments and derivatives thereof, that bind specifically 
to a polypeptide having an amino acid sequence in SEQ ID NO: 
14, which is unique to the HTPL-L isoform, and binding of which 
can be competitively inhibited by a polypeptide the sequence of 
which is given in SEQ ID NO: 3 and cannot be competitively 
inhibited by a polypeptide having the amino acid sequence of 
SEQ ID NO: 6 (the full length HTPL-S protein). 

Such antibodies are useful in in vitro immunoassays, 
such as ELISA, western blot or immunohistochemical assay of 
testis or other disease tissue or cells. Such antibodies are 
also useful in isolating and purifying HTPL-L protein, 
including related cross-reactive proteins, by
immunoprecipitation, immunoaffinity chromatography, or magnetic
bead-mediated purification.

In another series of antibody embodiments, the 
invention provides antibodies, both polyclonal and monoclonal, 
and fragments and derivatives thereof, the specific binding of 
which can be competitively inhibited by the isolated proteins 
and polypeptides of the present invention. 

In other embodiments, the invention further provides 
the above-described antibodies detectably labeled, and in yet 
other embodiments, provides the above-described antibodies
attached to a substrate. 

PHARMACEUTICAL COMPOSITIONS 

HTPL is important in regulating male germ cell 
development and as a tumor suppressor; defects in HTPL 
expression, activity, distribution, localization, and/or 
solubility are a cause of human disease, which disease can 
manifest as a disorder of testis, or adrenal, adult and fetal
liver, bone marrow, brain, kidney, lung, placenta, prostate,
skeletal muscle or colon function.

Accordingly, pharmaceutical compositions comprising 
nucleic acids, proteins, and antibodies of the present 
invention, as well as mimetics, agonists, antagonists, or 
inhibitors of HTPL activity, can be administered as 
therapeutics for treatment of HTPL defects. 

Thus, in another aspect, the invention provides 
pharmaceutical compositions comprising the nucleic acids, 
nucleic acid fragments, proteins, protein fusions, protein 
fragments, antibodies, antibody derivatives, antibody 
fragments, mimetics, agonists, antagonists, and inhibitors of 
the present invention. 

Such a composition typically contains from about 0.1 
to 90% by weight of a therapeutic agent of the invention 
formulated in and/or with a pharmaceutically acceptable carrier 
or excipient. 

Pharmaceutical formulation is a well-established art, 
and is further described in Gennaro (ed.), Remington: The
Science and Practice of Pharmacy, 20th ed., Lippincott,
Williams & Wilkins (2000) (ISBN: 0683306472); Ansel et al.,
Pharmaceutical Dosage Forms and Drug Delivery Systems, 7th ed.,
Lippincott Williams & Wilkins Publishers (1999) (ISBN:
0683305727); and Kibbe (ed.), Handbook of Pharmaceutical
Excipients, American Pharmaceutical Association, 3rd ed. (2000)
(ISBN: 091733096X), the disclosures of which are incorporated
herein by reference in their entireties, and thus need not be
described in detail herein.

Briefly, however, formulation of the pharmaceutical 
compositions of the present invention will depend upon the 
route chosen for administration. The pharmaceutical 
compositions utilized in this invention can be administered by 
various routes including both enteral and parenteral routes,
including oral, intravenous, intramuscular, subcutaneous,
inhalation, topical, sublingual, rectal, intra-arterial,
intramedullary, intrathecal, intraventricular, transmucosal,
transdermal, intranasal, intraperitoneal, intrapulmonary, and
intrauterine.

Oral dosage forms can be formulated as tablets, 
pills, dragees, capsules, liquids, gels, syrups, slurries, 
suspensions, and the like, for ingestion by the patient. 

Solid formulations of the compositions for oral 
administration can contain suitable carriers or excipients, 
such as carbohydrate or protein fillers, such as sugars, 
including lactose, sucrose, mannitol, or sorbitol; starch from 
corn, wheat, rice, potato, or other plants; cellulose, such as 
methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, or microcrystalline cellulose; gums 
including arabic and tragacanth; proteins such as gelatin and 
collagen; inorganics, such as kaolin, calcium carbonate, 
dicalcium phosphate, sodium chloride; and other agents such as 
acacia and alginic acid. 

Agents that facilitate disintegration and/or
solubilization can be added, such as the cross-linked polyvinyl
pyrrolidone, agar, alginic acid, or a salt thereof, such as
sodium alginate, microcrystalline cellulose, corn starch,
sodium starch glycolate, and alginic acid.

Tablet binders that can be used include acacia, 
methylcellulose, sodium carboxymethylcellulose,
polyvinylpyrrolidone (Povidone™), hydroxypropyl
methylcellulose, sucrose, starch and ethylcellulose.

Lubricants that can be used include magnesium 
stearates, stearic acid, silicone fluid, talc, waxes, oils, and 
colloidal silica. 

Fillers, agents that facilitate disintegration and/or 
solubilization, tablet binders and lubricants, including the 
aforementioned, can be used singly or in combination. 

Solid oral dosage forms need not be uniform
throughout.

For example, dragee cores can be used in conjunction 
with suitable coatings, such as concentrated sugar solutions, 
which can also contain gum arabic, talc, polyvinylpyrrolidone, 
carbopol gel, polyethylene glycol, and/or titanium dioxide, 
lacquer solutions, and suitable organic solvents or solvent
mixtures.

Oral dosage forms of the present invention include 
push-fit capsules made of gelatin, as well as soft, sealed 
capsules made of gelatin and a coating, such as glycerol or 
sorbitol. Push-fit capsules can contain active ingredients 
mixed with a filler or binders, such as lactose or starches, 
lubricants, such as talc or magnesium stearate, and, 
optionally, stabilizers. In soft capsules, the active 
compounds can be dissolved or suspended in suitable liquids, 
such as fatty oils, liquid, or liquid polyethylene glycol with 
or without stabilizers. 

Additionally, dyestuffs or pigments can be added to
the tablets or dragee coatings for product identification or to
characterize the quantity of active compound, i.e., dosage.

Liquid formulations of the pharmaceutical
compositions for oral (enteral) administration are prepared in
water or other aqueous vehicles and can contain various
suspending agents such as methylcellulose, alginates,
tragacanth, pectin, kelgin, carrageenan, acacia,
polyvinylpyrrolidone, and polyvinyl alcohol. The liquid
formulations can also include solutions, emulsions, syrups and
elixirs containing, together with the active compound(s),
wetting agents, sweeteners, and coloring and flavoring agents.

The pharmaceutical compositions of the present
invention can also be formulated for parenteral administration.

For intravenous injection, water soluble versions of 
the compounds of the present invention are formulated in, or if 
provided as a lyophilate, mixed with, a physiologically 
acceptable fluid vehicle, such as 5% dextrose ("D5"),
physiologically buffered saline, 0.9% saline, Hanks' solution, 
or Ringer's solution. 

Intramuscular preparations, e.g. a sterile 
formulation of a suitable soluble salt form of the compounds of 
the present invention, can be dissolved and administered in a 
pharmaceutical excipient such as Water-for-Injection, 0.9%
saline, or 5% glucose solution. Alternatively, a suitable
insoluble form of the compound can be prepared and administered
as a suspension in an aqueous base or a pharmaceutically
acceptable oil base, such as an ester of a long chain fatty
acid (e.g., ethyl oleate), fatty oils such as sesame oil,
triglycerides, or liposomes. 

Parenteral formulations of the compositions can 
contain various carriers such as vegetable oils, 
dimethylacetamide, dimethylformamide, ethyl lactate, ethyl
carbonate, isopropyl myristate, ethanol, polyols (glycerol,
propylene glycol, liquid polyethylene glycol, and the like).

Aqueous injection suspensions can also contain
substances that increase the viscosity of the suspension, such
as sodium carboxymethyl cellulose, sorbitol, or dextran.
Non-lipid polycationic amino polymers can also be used for
delivery. Optionally, the suspension can also contain suitable
stabilizers or agents that increase the solubility of the
compounds to allow for the preparation of highly concentrated
solutions.

Pharmaceutical compositions of the present invention
can also be formulated to permit injectable, long-term
deposition.

The pharmaceutical compositions of the present 
invention can be administered topically. 

A topical semi-solid ointment formulation typically
contains a concentration of the active ingredient from about 1
to 20%, e.g., 5 to 10%, in a carrier such as a pharmaceutical
cream base. Various formulations for topical use include
drops, tinctures, lotions, creams, solutions, and ointments
containing the active ingredient and various supports and
vehicles. In other transdermal formulations, typically in
patch-delivered formulations, the pharmaceutically active
compound is formulated with one or more skin penetrants, such
as 2-N-methyl-pyrrolidone (NMP) or Azone.

Inhalation formulations can also readily be
formulated. For inhalation, various powder and liquid
formulations can be prepared.

The pharmaceutically active compound in the
pharmaceutical compositions of the present invention can be
provided as the salt of a variety of acids, including but not
limited to hydrochloric, sulfuric, acetic, lactic, tartaric,
malic, and succinic acid. Salts tend to be more soluble in
aqueous or other protonic solvents than are the corresponding
free base forms.



After pharmaceutical compositions have been prepared, 
they are packaged in an appropriate container and labeled for 
treatment of an indicated condition. 

The active compound will be present in an amount
effective to achieve the intended purpose. The determination
of an effective dose is well within the capability of those
skilled in the art.

A "therapeutically effective dose" refers to that 
amount of active ingredient - for example HTPL protein, fusion
protein, or fragments thereof, antibodies specific for HTPL,
agonists, antagonists or inhibitors of HTPL - which ameliorates
the signs or symptoms of the disease or prevents progression
thereof; as would be understood in the medical arts, cure,
although desired, is not required.

The therapeutically effective dose of the
pharmaceutical agents of the present invention can be estimated
initially by in vitro tests, such as cell culture assays,
followed by assay in model animals, usually mice, rats,
rabbits, dogs, or pigs. The animal model can also be used to
determine an initial useful concentration range and route of
administration.

For example, the ED50 (the dose therapeutically
effective in 50% of the population) and LD50 (the dose lethal
to 50% of the population) can be determined in one or more cell
culture or animal model systems. The dose ratio of toxic to
therapeutic effects is the therapeutic index, which can be
expressed as LD50/ED50. Pharmaceutical compositions that
exhibit large therapeutic indices are particularly useful.
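
By way of illustration only, the therapeutic index described
above can be computed as in the following Python sketch. The
function name and the dose values are hypothetical and are not
drawn from any experiment described herein; both doses are
assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to show the arithmetic.
example_ld50 = 200.0
example_ed50 = 10.0
print(therapeutic_index(example_ld50, example_ed50))
```

A larger ratio indicates a wider margin between the dose that is
effective and the dose that is lethal in the model system used.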

The data obtained from cell culture assays and animal
studies is used in formulating an initial dosage range for
human use, and preferably provides a range of circulating
concentrations that includes the ED50 with little or no
toxicity. After administration, or between successive
administrations, the circulating concentration of active agent
varies within this range depending upon pharmacokinetic factors
well known in the art, such as the dosage form employed,
sensitivity of the patient, and the route of administration.

The exact dosage will be determined by the
practitioner, in light of factors specific to the subject
requiring treatment. Factors that can be taken into account by
the practitioner include the severity of the disease state,
general health of the subject, age, weight, gender of the
subject, diet, time and frequency of administration, drug
combination(s), reaction sensitivities, and tolerance/response
to therapy. Long-acting pharmaceutical compositions can be
administered every 3 to 4 days, every week, or once every two
weeks depending on half-life and clearance rate of the
particular formulation.

Normal dosage amounts may vary from 0.1 to 100,000
micrograms, up to a total dose of about 1 g, depending upon the
route of administration. Where the therapeutic agent is a
protein or antibody of the present invention, the therapeutic
protein or antibody agent typically is administered at a daily
dosage of 0.01 mg to 30 mg/kg of body weight of the patient
(e.g., 1 mg/kg to 5 mg/kg). The pharmaceutical formulation can
be administered in multiple doses per day, if desired, to
achieve the total desired daily dose.
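
The weight-based arithmetic in the preceding paragraph can be
sketched as follows. This is a minimal illustration of the
calculation only: the function names, the 70 kg body weight, and
the choice of two doses per day are assumptions made for the
example, and the bounds are simply the 0.01 to 30 mg/kg daily
range stated above.

```python
MIN_MG_PER_KG = 0.01  # lower bound of the daily range stated in the text
MAX_MG_PER_KG = 30.0  # upper bound of the daily range stated in the text


def total_daily_dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Total daily dose in mg for a weight-based protein or antibody dosage."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not MIN_MG_PER_KG <= dose_mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("dosage outside the 0.01 to 30 mg/kg range")
    return body_weight_kg * dose_mg_per_kg


def dose_per_administration_mg(total_daily_mg: float, doses_per_day: int) -> float:
    """Split a total daily dose evenly across several administrations."""
    if doses_per_day < 1:
        raise ValueError("at least one dose per day is required")
    return total_daily_mg / doses_per_day


# Hypothetical subject: 70 kg, 1 mg/kg per day, given as two doses.
daily_mg = total_daily_dose_mg(70.0, 1.0)
print(daily_mg, dose_per_administration_mg(daily_mg, 2))
```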

Guidance as to particular dosages and methods of
delivery is provided in the literature and generally available
to practitioners in the art. Those skilled in the art will
employ different formulations for nucleotides than for proteins
or their inhibitors. Similarly, delivery of polynucleotides or
polypeptides will be specific to particular cells, conditions,
locations, etc.

Conventional methods, known to those of ordinary
skill in the art of medicine, can be used to administer the
pharmaceutical formulation(s) of the present invention to the
patient. The pharmaceutical compositions of the present
invention can be administered alone, or in combination with
other therapeutic agents or interventions.

THERAPEUTIC METHODS 

The present invention further provides methods of
treating subjects having defects in HTPL - e.g., in expression,
activity, distribution, localization, and/or solubility of
HTPL - which can manifest as a disorder of testis, or adrenal,
adult and fetal liver, bone marrow, brain, kidney, lung,
placenta, prostate, skeletal muscle or colon function. As used
herein, "treating" includes all medically-acceptable types of
therapeutic intervention, including palliation and prophylaxis
(prevention) of disease.

In one embodiment of the therapeutic methods of the
present invention, a therapeutically effective amount of a
pharmaceutical composition comprising HTPL protein, fusion,
fragment or derivative thereof is administered to a subject
with a clinically-significant HTPL defect.

Protein compositions are administered, for example,
to complement a deficiency in native HTPL. In other
embodiments, protein compositions are administered as a vaccine
to elicit a humoral and/or cellular immune response to HTPL.
The immune response can be used to modulate activity of HTPL
or, depending on the immunogen, to immunize against aberrant or
aberrantly expressed forms, such as mutant or inappropriately
expressed isoforms. In yet other embodiments, protein fusions
having a toxic moiety are administered to ablate cells that
aberrantly accumulate HTPL.

In another embodiment of the therapeutic methods of
the present invention, a therapeutically effective amount of a
pharmaceutical composition comprising nucleic acid of the
present invention is administered. The nucleic acid can be
delivered in a vector that drives expression of HTPL protein,
fusion, or fragment thereof, or without such vector.

Nucleic acid compositions that can drive expression 
of HTPL are administered, for example, to complement a 
deficiency in native HTPL, or as DNA vaccines. Expression 
vectors derived from virus, replication deficient retroviruses,
adenovirus, adeno-associated (AAV) virus, herpes virus, or
vaccinia virus can be used - see, e.g., Cid-Arregui (ed.),
Viral Vectors: Basic Science and Gene Therapy, Eaton Publishing
Co., 2000 (ISBN: 188129935X) - as can plasmids.

Antisense nucleic acid compositions, or vectors that 
drive expression of HTPL antisense nucleic acids, are 
administered to downregulate transcription and/or translation 
of HTPL in circumstances in which excessive production, or 
production of aberrant protein, is the pathophysiologic basis 
of disease. 

Antisense compositions useful in therapy can have 
sequence that is complementary to coding or to noncoding 
regions of the HTPL gene. For example, oligonucleotides 
derived from the transcription initiation site, e.g., between 
positions -10 and +10 from the start site, are particularly
useful.
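
As a hedged illustration of how such an oligonucleotide could be
derived, the following Python sketch takes a sense-strand DNA
sequence and returns the reverse complement of the window from
position -10 to +10 around a start site. The function names and
the input sequence are hypothetical (the sequence is not HTPL
sequence), and the numbering is assumed to have no position 0,
so that the window spans 20 nucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_oligo(sense: str, plus_one_index: int,
                    upstream: int = 10, downstream: int = 10) -> str:
    """Antisense oligo covering positions -upstream..+downstream.

    plus_one_index is the 0-based index of the +1 nucleotide in the
    sense strand; position 0 is assumed not to exist.
    """
    begin = plus_one_index - upstream
    end = plus_one_index + downstream
    if begin < 0 or end > len(sense):
        raise ValueError("window extends beyond the supplied sequence")
    return reverse_complement(sense[begin:end])


# Hypothetical sense-strand sequence, used only to exercise the code.
example_sense = "GATTACAGATTACAGATTACAGATTACA"
print(antisense_oligo(example_sense, plus_one_index=14))
```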

Catalytic antisense compositions, such as ribozymes, 
that are capable of sequence-specific hybridization to HTPL 
transcripts, are also useful in therapy. See, e.g., Phylactou,
Adv. Drug Deliv. Rev. 44(2-3):97-108 (2000); Phylactou et al.,
Hum. Mol. Genet. 7(10):1649-53 (1998); Rossi, Ciba Found. Symp.
209:195-204 (1997); and Sigurdsson et al., Trends Biotechnol.
13(8):286-9 (1995), the disclosures of which are incorporated
herein by reference in their entireties.






Other nucleic acids useful in the therapeutic methods
of the present invention are those that are capable of triplex
helix formation in or near the HTPL genomic locus. Such
triplexing oligonucleotides are able to inhibit transcription,
Intody et al., Nucleic Acids Res. 28(21):4283-90 (2000);
McGuffie et al., Cancer Res. 60(14):3790-9 (2000), the
disclosures of which are incorporated herein by reference, and
pharmaceutical compositions comprising such triplex forming
oligos (TFOs) are administered in circumstances in which
excessive production, or production of aberrant protein, is a
pathophysiologic basis of disease.

In another embodiment of the therapeutic methods of
the present invention, a therapeutically effective amount of a
pharmaceutical composition comprising an antibody (including
fragment or derivative thereof) of the present invention is
administered. As is well known, antibody compositions are
administered, for example, to antagonize activity of HTPL, or
to target therapeutic agents to sites of HTPL presence and/or
accumulation.

In another embodiment of the therapeutic methods of
the present invention, a pharmaceutical composition comprising
a non-antibody antagonist of HTPL is administered. Antagonists
of HTPL can be produced using methods generally known in the
art. In particular, purified HTPL can be used to screen
libraries of pharmaceutical agents, often combinatorial
libraries of small molecules, to identify those that
specifically bind and antagonize at least one activity of HTPL.

In other embodiments a pharmaceutical composition
comprising an agonist of HTPL is administered. Agonists can be
identified using methods analogous to those used to identify
antagonists.

In still other therapeutic methods of the present 
invention, pharmaceutical compositions comprising host cells 
that express HTPL, fusions, or fragments thereof can be
administered. In such cases, the cells are typically 
autologous, so as to circumvent xenogeneic or allotypic 
rejection, and are administered to complement defects in HTPL 
production or activity. 

In other embodiments, pharmaceutical compositions 
comprising the HTPL proteins, nucleic acids, antibodies, 
antagonists, and agonists of the present invention can be 
administered in combination with other appropriate therapeutic 
agents. Selection of the appropriate agents for use in 
combination therapy can be made by one of ordinary skill in the
art according to conventional pharmaceutical principles. The
combination of therapeutic agents or approaches can act
additively or synergistically to effect the treatment or
prevention of the various disorders described above, providing
greater therapeutic efficacy and/or permitting use of the
pharmaceutical compositions of the present invention using
lower dosages, reducing the potential for adverse side effects.

TRANSGENIC ANIMALS AND CELLS 

In another aspect, the invention provides transgenic 
cells and non-human organisms comprising HTPL isoform nucleic 
acids, and transgenic cells and non-human organisms with 
targeted disruption of the endogenous orthologue of the human 
HTPL gene. 

The cells can be embryonic stem cells or somatic 
cells. The transgenic non-human organisms can be chimeric, 
nonchimeric heterozygotes, and nonchimeric homozygotes.

DIAGNOSTIC METHODS



The nucleic acids of the present invention can be 
used as nucleic acid probes to assess the levels of HTPL mRNA 
in testis or other disease tissue or cells, and antibodies of 
the present invention can be used to assess the expression 
levels of HTPL proteins in testis or other disease tissue or 
cells to diagnose male infertility or cancer. 

The following examples are offered for purpose of 
illustration, not limitation. 

EXAMPLE 1 

Identification and Characterization of 
cDNAs Encoding HTPL Proteins 

Predicating our gene discovery efforts on use of
genome-derived single exon probes and hybridization to genome-
derived single exon microarrays - an approach that we have
previously demonstrated will readily identify novel genes that
have proven refractory to mRNA-based identification efforts -
we identified an exon in raw human genomic sequence that is
particularly expressed in human adrenal, adult and fetal liver,
bone marrow, brain, kidney, lung, placenta and prostate.

Briefly, bioinformatic algorithms were applied to
human genomic sequence data to identify putative exons. Each
of the predicted exons was amplified from genomic DNA,
typically centering the putative coding sequence within a
larger amplicon that included flanking noncoding sequence.
These genome-derived single exon probes were arrayed on a
support and expression of the bioinformatically predicted exons
assessed through a series of simultaneous two-color
hybridizations to the genome-derived single exon microarrays.

The approach and procedures are further described in 
detail in Penn et al., "Mining the Human Genome using
Microarrays of Open Reading Frames," Nature Genetics 26:315-318
(2000); commonly owned and copending U.S. patent application
nos. 09/864,761, filed May 23, 2001, 09/774,203, filed January
29, 2001, and 09/632,366, filed August 3, 2000, the disclosures
of which are incorporated herein by reference in their
entireties.

Using a graphical display particularly designed to
facilitate computerized query of the resulting exon-specific
expression data, as further described in commonly owned and
copending U.S. patent application nos. 09/864,761, filed May 23,
2001, 09/774,203, filed January 29, 2001, and 09/632,366, filed
August 3, 2000, the disclosures of which are incorporated
herein by reference in their entireties, one exon was
identified that is expressed in all the human tissues tested;
subsequent analysis revealed that the exon represents a gene.

Table 1 summarizes the microarray expression data
obtained using the genome-derived single exon probe corresponding
to exon four. The probe was completely sequenced on both
strands prior to its use on a genome-derived single exon
microarray; sequencing confirmed the exact chemical structure
of the probe. An added benefit of sequencing is that it placed
us in possession of a set of single base-incremented fragments
of the sequenced nucleic acid, starting from the sequencing
primer's 3' OH. (Since the single exon probe was first
obtained by PCR amplification from genomic DNA, we were of
course additionally in possession of an even larger set of
single base-incremented fragments of the single exon probe,
each fragment corresponding to an extension product from one of
the two amplification primers.)

Signals and expression ratios are normalized values 
measured and calculated as further described in commonly owned 
and copending U.S. patent application nos. 09/864,761, filed 
May 23, 2001, 09/774,203, filed January 29, 2001, 09/632,366, 
filed August 3, 2000, and U.S. provisional patent application
no. 60/207,456, filed May 26, 2000, the disclosures of which 
are incorporated herein by reference in their entireties. 



Table 1
Expression Analysis
Genome-Derived Single Exon Microarray

Amplicon 86654, Exon 4

(TISSUE)        Signal    Ratio
Adrenal         1.02       1.06
adult liver     0.95       1.00
bone marrow     1.63       1.01
Brain           1.20       1.27
fetal liver     1.22      -1.22
Kidney          0.90       1.07
Lung            0.96      -1.11
Placenta        1.02      -1.05
Prostate        0.85       1.18

As shown in Table 1, low level expression of exon 
four was seen in adrenal, adult and fetal liver, bone marrow, 
brain, kidney, lung, placenta and prostate. Low level 
expression of HTPL in these tissues was further confirmed by 
RT-PCR analysis (see below).

Rapid Amplification of cDNA ends (RACE, Clontech
Laboratories, Palo Alto, CA) and direct RT-PCR were used to
obtain the coding region of the human cDNA sequence. The final
pair of primers used to amplify the 3.3 kb transcript was:
62NF3 (5'-CAGGAAACCGTCTGGTGGGATCTC-3'; SEQ ID NO: 4801) and
62487Rend (5'-CTGAGACGGAGTCTCATTCTTGTCACC-3'; SEQ ID NO: 4802).
Human testis Marathon ready cDNA (Clontech Laboratories) was
used to clone the entire coding region. The PCR parameters were
as follows: 94°C 1 min; (94°C 10 seconds; 64°C 30 seconds; 72°C
3 min.) for 35 cycles. The PCR reaction composition was as
follows: 5 ul of cDNA; 5 ul of 10X amplification buffer; 1 ul
dNTP (10 M); 3 ul of primer pairs; 1 ul of Taq polymerase Mix
and 35 ul double distilled water.
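
For the reader's convenience, the following Python sketch
tallies the reaction volume, the primer lengths, and the total
programmed hold time implied by the parameters given above. It
is a bookkeeping illustration only: the variable and function
names are our own for this example, the GC fraction is a
descriptive statistic not reported in the text, and instrument
ramp times are ignored.

```python
# Reaction components in microliters, as listed in the text.
reaction_ul = {
    "cDNA": 5,
    "10X amplification buffer": 5,
    "dNTP": 1,
    "primer pairs": 3,
    "Taq polymerase mix": 1,
    "double distilled water": 35,
}
total_volume_ul = sum(reaction_ul.values())

# Primer sequences as given in the text (5' to 3').
primers = {
    "62NF3": "CAGGAAACCGTCTGGTGGGATCTC",
    "62487Rend": "CTGAGACGGAGTCTCATTCTTGTCACC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer."""
    return sum(base in "GC" for base in seq) / len(seq)


# Cycling program: 94 C for 1 min, then 35 cycles of
# 94 C for 10 s, 64 C for 30 s, and 72 C for 3 min.
initial_denaturation_s = 60
cycle_s = 10 + 30 + 3 * 60
cycles = 35
total_hold_seconds = initial_denaturation_s + cycles * cycle_s

print(f"total reaction volume: {total_volume_ul} ul")
for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
print(f"programmed hold time: {total_hold_seconds / 60:.1f} min")
```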

The PCR product was cloned into a pGEM-T Easy vector
and inserts of multiple clones were sequenced on both strands
using a MegaBACE™ automatic sequencer (Amersham Biosciences,
Sunnyvale, CA). Single base pair sequence changes were
identified between two groups of clones, and each group was
recognized as one isoform. The presence of two isoforms was
confirmed by the presence of two genomic clones (see below), in
the public database, with the same single base pair sequence
changes between them. For reasons described below, we named the
two isoforms HTPL-L and HTPL-S. Sequencing both strands
provided us with the exact chemical structure of the cDNAs,
which are shown in FIG. 3 and FIG. 4 and further presented in
the SEQUENCE LISTING as SEQ ID NOs: 1 and 4, and placed us in
actual physical possession of the entire set of single-base
incremented fragments of the sequenced clones, starting at the
5' and 3' termini.

As shown in FIG. 3, the HTPL-L cDNA spans 3296 
nucleotides and contains an open reading frame from nucleotide 
78 through and including nt 2942 (inclusive of termination
codon), predicting a protein of 954 amino acids with a
(posttranslationally unmodified) molecular weight of 107.6 kD. 
The clone appears full length, with the reading frame opening 
starting with a methionine and terminating with a stop codon. 

As shown in FIG. 4, the HTPL-S cDNA spans 3298
nucleotides and contains an open reading frame from nucleotide
78 through and including nt 2381 (inclusive of termination
codon), predicting a protein of 767 amino acids with a
(posttranslationally unmodified) molecular weight of 86.9 kD.
The clone appears full length, with the reading frame opening
starting with a methionine and terminating with a stop codon.
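
The open reading frame coordinates reported above for the two
cDNAs are arithmetically consistent with the stated protein
lengths, as the following Python sketch shows. The function name
is our own for this illustration; the coordinates are those given
in the text, are 1-based, and include the termination codon.

```python
def orf_summary(first_nt: int, last_nt: int) -> tuple[int, int]:
    """Return (ORF length in nt, encoded amino acids excluding stop).

    Coordinates are 1-based and inclusive of the termination codon.
    """
    length_nt = last_nt - first_nt + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_nt // 3
    return length_nt, codons - 1  # the last codon is the stop codon


# ORF coordinates as stated for each isoform.
orfs = {"HTPL-L": (78, 2942), "HTPL-S": (78, 2381)}

for isoform, (first_nt, last_nt) in orfs.items():
    length_nt, amino_acids = orf_summary(first_nt, last_nt)
    stop_codon_start = last_nt - 2
    print(f"{isoform}: {length_nt} nt, {amino_acids} aa, "
          f"stop codon begins at nt {stop_codon_start}")
```

The computed lengths are 954 and 767 amino acids, matching the
predicted proteins of HTPL-L and HTPL-S respectively.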

BLAST query of genomic sequence identified one BAC,
spanning 16.7 kb, which constitutes the minimum set of clones
encompassing the cDNA sequences. Based upon the known origin
of the BAC (GenBank accession number AC005875.2), the HTPL gene
can be mapped to human chromosome 10p12.1.

Comparison of the cDNA and genomic sequences 
identified four exons for HTPL. Exon organization for HTPL is 
listed in Table 2 . 



Table 2

HTPL Exon Structure

Exon no.   cDNA range            genomic range    BAC accession
1          1-1166                100221-99056     AC005875.2
2          1167-1288             97823-97702
3          1289-1434             89251-89106
4          1435-3296 (HTPL-L)    85101-83240
           1435-3298 (HTPL-S)    85101-83238
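
The coordinates in Table 2 can be cross-checked by confirming that each exon has the same length in the cDNA and in the genomic sequence. This is a minimal sketch only; the names `EXONS` and `span` are illustrative, and the genomic coordinates are assumed to run from high to low because the gene lies on the reverse strand of the BAC.

```python
# (exon label, cDNA start, cDNA end, genomic start, genomic end), from Table 2.
EXONS = [
    ("1", 1, 1166, 100221, 99056),
    ("2", 1167, 1288, 97823, 97702),
    ("3", 1289, 1434, 89251, 89106),
    ("4 (HTPL-L)", 1435, 3296, 85101, 83240),
    ("4 (HTPL-S)", 1435, 3298, 85101, 83238),
]


def span(a: int, b: int) -> int:
    """Length of an inclusive coordinate range, in either orientation."""
    return abs(a - b) + 1


def exon_lengths(exons):
    """Map each exon label to (cDNA length, genomic length)."""
    return {
        label: (span(c_start, c_end), span(g_start, g_end))
        for label, c_start, c_end, g_start, g_end in exons
    }


for label, (cdna_len, genomic_len) in exon_lengths(EXONS).items():
    status = "ok" if cdna_len == genomic_len else "MISMATCH"
    print(f"exon {label}: cDNA {cdna_len} nt, genomic {genomic_len} nt, {status}")
```

Every row of Table 2 passes this check, with exon lengths of 1166, 122, 146 and 1862 (HTPL-L) or 1864 (HTPL-S) nucleotides.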





FIG. 2 schematizes the exon organization of the HTPL
gene.

At the top is shown the bacterial artificial
chromosome (BAC), with GenBank accession number (AC005875.2),
that spans the HTPL locus. The genome-derived single-exon probe
first used to demonstrate expression from this locus is shown
below the BAC and labeled "500". The 500 bp probe includes
sequence drawn solely from exon four.

As shown in FIG. 2, two HTPL isoforms have been
identified. Both isoforms contain four exons, with a few single
base pair differences between them (FIG. 3 and FIG. 4). Indeed,
BAC AC005875.2 contains the same exonic sequence as HTPL-L,
while BAC AL355493.1 contains the same exonic sequence as HTPL-S.
The longer isoform, HTPL-L, encodes a protein of 954 amino
acids that has a predicted molecular weight, prior to any
post-translational modification, of 107.6 kD. One of the single base
pair changes in the shorter form (HTPL-S) introduces a
premature stop codon at position 2379 of the HTPL-S cDNA. HTPL-S,
therefore, encodes a shortened protein of 767 amino acids
that has a predicted molecular weight, prior to any
post-translational modification, of 86.9 kD. Both cDNA clones appear
full length, with the open reading frame starting with a
methionine and terminating with a stop codon.

As further discussed in the examples herein,
expression of HTPL was assessed using hybridization to
genome-derived single exon microarrays. Microarray analysis of the
fourth exon showed low level expression in all tissues tested,
including adrenal, adult liver, bone marrow, brain, fetal
liver, kidney, lung, placenta and prostate. This was confirmed
by RT-PCR. RT-PCR also detected strong expression in testis and
weak expression in colon and skeletal muscle (see Example 3).

The sequence of the HTPL cDNA was used as a BLAST
query into the GenBank nr and dbEst databases. The nr database
includes all non-redundant GenBank coding sequence
translations, sequences derived from the 3-dimensional
structures in the Brookhaven Protein Data Bank (PDB), sequences
from SwissProt, sequences from the protein information resource
(PIR), and sequences from protein research foundation (PRF).
The dbEst (database of expressed sequence tags) includes ESTs,
short, single pass read cDNA (mRNA) sequences, and cDNA
sequences from differential display experiments and RACE
experiments.

BLAST search identified a single human EST
(AW665031.1), a single mouse EST (AV280614.1), and a single pig
EST (AW436721.1) as having sequence closely related to HTPL.

Globally, the human HTPL proteins resemble the product of a
putative mouse transcript (GenBank accession: BAB29848), with the
HTPL-L protein showing 66% amino acid identity and 78% amino acid
similarity over the entire open reading frame.

Motif searches using Pfam (http://pfam.wustl.edu),
SMART (http://smart.embl-heidelberg.de), and PROSITE pattern
and profile databases (http://www.expasy.ch/prosite)
identified several known domains shared with Patched, including
the Patched domain and the Sterol-sensing domain.

FIG. 1 shows the domain structure of HTPL-L and HTPL-S
as well as the alignment of the Patched domain of HTPL-L with
that of other proteins.

As schematized in FIG. 1, HTPL shares an overall
structural organization with the Patched protein. The shared
structural features strongly imply that HTPL plays a role
similar to that of Patched, in male germ cell development, and
is a potential tumor suppressor.

Like Patched, HTPL-L contains a Patched domain
(http://pfam.wustl.edu/hmmsearch.shtml), a Sterol-sensing
domain (SSD, http://motif.genome.ad.jp/) and twelve
transmembrane domains
(http://smart.embl-heidelberg.de/smart/show_motifs.pl). The
Patched domain in HTPL-L covers amino acids 166 - 952 of HTPL-L.
The SSD domain in HTPL-L covers amino acids 383 - 540 of HTPL-L.
The presence of these domains in HTPL-L suggests that HTPL-L,
like Patched and other Patched domain containing proteins, is
involved in the Hedgehog signaling pathway (see background
section). Because of the premature stop of protein translation,
HTPL-S contains a partial Patched domain, a complete
Sterol-sensing motif and seven transmembrane domains. The presence of
these domains in HTPL-S suggests that HTPL-S is also involved
in the Hedgehog signaling pathway.

Other signatures of the newly isolated HTPL proteins
were identified by searching the PROSITE database
(http://www.expasy.ch/tools/scnpsitl.html), and the list below
is for both HTPL-L and HTPL-S unless specified otherwise. These
include seven N-glycosylation sites (192 - 195, 275 - 278, 279
- 282, 530 - 533, 678 - 681, 692 - 695 and 737 - 740), one
cAMP- and cGMP-dependent protein kinase phosphorylation site
(201 - 204), seven protein kinase C phosphorylation sites (194
- 196, 200 - 202, 508 - 510, 561 - 563, 662 - 664, 746 - 748,
and 759 - 761; plus one for HTPL-L at 800 - 802), twelve Casein
kinase II phosphorylation sites (19 - 22, 36 - 39, 62 - 65, 79
- 82, 190 - 193, 215 - 218, 219 - 222, 225 - 228, 230 - 233,
572 - 575, 597 - 600, and 740 - 743), two tyrosine kinase
phosphorylation sites (329 - 335, and 681 - 688; plus one for
HTPL-L at 887 - 893), four N-myristoylation sites (307 - 312,
418 - 423, 504 - 509, and 535 - 540; plus one for HTPL-L at
935 - 940), and a single amidation site at 541 - 544.
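
By way of illustration only, a pattern search of the kind performed by PROSITE can be sketched for the N-glycosylation signature, whose PROSITE pattern (PS00001) is N-{P}-[ST]-{P}. The function name and the short test sequence below are hypothetical; the sequence is not drawn from HTPL, and the sketch is not the tool used to generate the site list above.

```python
import re

# PROSITE PS00001 (N-glycosylation): N-{P}-[ST]-{P}, i.e. Asn, any residue
# except Pro, Ser or Thr, any residue except Pro. A lookahead is used so
# that overlapping occurrences are all reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def n_glycosylation_sites(protein: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of each match."""
    return [
        (m.start() + 1, m.start() + 4)
        for m in N_GLYCOSYLATION.finditer(protein.upper())
    ]


# Hypothetical toy sequence: "NGSA" matches, while "NPTA" (Pro at the second
# position) and "NATP" (Pro at the fourth position) do not.
toy_sequence = "AANGSAANPTAANATP"
print(n_glycosylation_sites(toy_sequence))
```

Each reported site spans four residues, which is consistent with the four-residue ranges listed above for the N-glycosylation sites.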

Possession of the genomic sequence permitted search
for promoter and other control sequences for the HTPL gene.
A putative transcriptional control region, inclusive of
promoter and downstream elements, was defined as 1 kb around
the transcription start site, itself defined as the first
nucleotide of the HTPL cDNA clone. The region, drawn from
sequence of BAC AC005875.2, has the sequence given in SEQ ID
NO: 23, which lists 1000 nucleotides before the transcription
start site.

Transcription factor binding sites were identified
using a web based program (http://motif.genome.ad.jp/),
including binding sites for homeodomain factor Nkx-2.5/Csx
(625 - 631 bp), for USF (891 - 898 bp) and for CdxA (399 - 405
and 612 - 618 bp, with numbering according to SEQ ID NO: 23),
amongst others.
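
The binding site positions, which are numbered according to SEQ ID NO: 23, can be re-expressed relative to the transcription start site. The sketch below assumes that SEQ ID NO: 23 consists of exactly the 1000 nucleotides immediately upstream of the start site, so that its last nucleotide is position -1 and the first nucleotide of the cDNA is +1; the names used are illustrative.

```python
PROMOTER_LENGTH = 1000  # nt in SEQ ID NO: 23, assumed to end at position -1

# Binding sites as listed above, numbered according to SEQ ID NO: 23.
SITES = {
    "Nkx-2.5/Csx": [(625, 631)],
    "USF": [(891, 898)],
    "CdxA": [(399, 405), (612, 618)],
}


def to_tss_relative(position: int) -> int:
    """Convert a 1-based SEQ ID NO: 23 position to a TSS-relative position.

    Uses the convention that there is no position 0: the nucleotide
    immediately upstream of the transcription start site is -1.
    """
    if not 1 <= position <= PROMOTER_LENGTH:
        raise ValueError("position lies outside SEQ ID NO: 23")
    return position - PROMOTER_LENGTH - 1


for factor, ranges in SITES.items():
    for start, end in ranges:
        print(f"{factor}: {to_tss_relative(start)} to {to_tss_relative(end)}")
```

Under this assumption the USF site, for example, lies at -110 to -103 relative to the transcription start site.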

We have thus identified a newly described human gene,
HTPL, which shares certain protein domains and an overall
structural organization with Patched. The shared structural
features strongly imply that the HTPL protein plays a role
similar to Patched, in the hedgehog signaling pathway,
functioning as a potential tumor suppressor and in germ cell
development. This makes the HTPL proteins and nucleic acids
clinically useful diagnostic markers and potential therapeutic
agents for male infertility and cancer.

EXAMPLE 2 
Preparation and Labeling of 
Useful Fragments of HTPL 

Useful fragments of HTPL are produced by PCR, using 
standard techniques, or solid phase chemical synthesis using an 
automated nucleic acid synthesizer. Each fragment is 
sequenced, confirming the exact chemical structure thereof. 

The exact chemical structure of preferred fragments
is provided in the attached SEQUENCE LISTING, the disclosure of
which is incorporated herein by reference in its entirety. The
following summary identifies the fragments whose structures are
more fully described in the SEQUENCE LISTING:



SEQ ID NO: 1            (nt, full length HTPL-L cDNA)
SEQ ID NO: 2            (nt, cDNA ORF of HTPL-L)
SEQ ID NO: 3            (aa, full length HTPL-L protein)
SEQ ID NO: 4            (nt, full length HTPL-S cDNA)
SEQ ID NO: 5            (nt, cDNA ORF of HTPL-S)
SEQ ID NO: 6            (aa, full length HTPL-S protein)
SEQ ID NO: 7            (nt, (nt 1 - 2021) portion of HTPL-L)
SEQ ID NO: 8            (nt, 5' UT portion of SEQ ID NO: 7)
SEQ ID NO: 9            (nt, coding region of SEQ ID NO: 7)
SEQ ID NO: 10           (aa, (aa 1 - 648) CDS entirely within SEQ ID NO: 9)
SEQ ID NO: 11           (nt, (nt 2637 - 3041) portion of HTPL-L)
SEQ ID NO: 12           (nt, coding region of SEQ ID NO: 11)
SEQ ID NO: 13           (nt, 3' UT portion of SEQ ID NO: 11)
SEQ ID NO: 14           (aa, (aa 854 - 954) CDS entirely within SEQ ID NO: 12)
SEQ ID NOs: 15 - 18     (nt, exons 1 - 4 of HTPL-L)
SEQ ID NOs: 19 - 22     (nt, 500 bp genomic amplicons centered about
                        exons 1 - 4 of HTPL-L)
SEQ ID NO: 23           (nt, 1000 bp putative promoter of HTPL)
SEQ ID NOs: 24 - 2028   (nt, 17-mers scanning SEQ ID NO: 7)
SEQ ID NOs: 2029 - 4025 (nt, 25-mers scanning SEQ ID NO: 7)
SEQ ID NOs: 4026 - 4414 (nt, 17-mers scanning SEQ ID NO: 11)
SEQ ID NOs: 4415 - 4795 (nt, 25-mers scanning SEQ ID NO: 11)
SEQ ID NO: 4796         (nt, (nt 1 - 2021) portion of HTPL-S)
SEQ ID NO: 4797         (nt, 5' UT portion of SEQ ID NO: 4796)
SEQ ID NO: 4798         (nt, coding region of SEQ ID NO: 4796)
SEQ ID NO: 4799         (aa, (aa 1 - 648) CDS entirely within SEQ ID NO: 4798)
SEQ ID NO: 4800         (nt, (nt 2637 - 3041) portion of HTPL-S)
SEQ ID NO: 4801         (nt, primer 62NF3 for cloning of HTPL)
SEQ ID NO: 4802         (nt, primer 62487Rend for cloning of HTPL)
SEQ ID NO: 4803         (nt, primer 62487Pu for RT-PCR analysis of HTPL)
SEQ ID NO: 4804         (nt, primer 62487Pd for RT-PCR analysis of HTPL)
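
The number of 17-mers and 25-mers that scan a fragment in single-nucleotide steps follows directly from the fragment length, and can be compared with the number of SEQ ID NOs assigned to each set. The following is a minimal sketch; the function names and the short example sequence are hypothetical and are not taken from the SEQUENCE LISTING.

```python
def scanning_kmers(sequence: str, k: int) -> list[str]:
    """All k-mers obtained by sliding a window one nucleotide at a time."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def window_count(length: int, k: int) -> int:
    """Number of k-mers that scan a sequence of the given length."""
    return max(length - k + 1, 0)


def id_count(first_id: int, last_id: int) -> int:
    """Number of SEQ ID NOs in an inclusive range."""
    return last_id - first_id + 1


# Fragment lengths taken from the summary: nt 1 - 2021 and nt 2637 - 3041.
len_seq7 = 2021 - 1 + 1
len_seq11 = 3041 - 2637 + 1

checks = [
    ("17-mers of SEQ ID NO: 7", window_count(len_seq7, 17), id_count(24, 2028)),
    ("25-mers of SEQ ID NO: 7", window_count(len_seq7, 25), id_count(2029, 4025)),
    ("17-mers of SEQ ID NO: 11", window_count(len_seq11, 17), id_count(4026, 4414)),
    ("25-mers of SEQ ID NO: 11", window_count(len_seq11, 25), id_count(4415, 4795)),
]
for label, expected, listed in checks:
    print(f"{label}: {expected} windows, {listed} SEQ ID NOs")

# Hypothetical toy example of the scanning itself.
print(scanning_kmers("ACGTAC", 4))
```

For each of the four sets, the number of scanning windows equals the number of SEQ ID NOs listed in the summary.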



Upon confirmation of the exact structure, each of the
above-described nucleic acids of confirmed structure is
recognized to be immediately useful as a HTPL-specific probe.

For use as labeled nucleic acid probes, the above-described
HTPL nucleic acids are separately labeled by random
priming. As is well known in the art of molecular biology,
random priming places the investigator in possession of a
near-complete set of labeled fragments of the template of varying
length and varying starting nucleotide.

The labeled probes are used to identify the HTPL gene
on a Southern blot, and are used to measure expression of HTPL
mRNA on a northern blot and by RT-PCR, using standard
techniques.

EXAMPLE 3 

Expression Analysis of HTPL by RT-PCR 

The Advantage 2 PCR amplification kit and PCR cDNA of
different human tissues were obtained from Clontech
Laboratories Inc. (Palo Alto, CA). The PCR parameters were
set up as follows: 94°C 15 seconds; 59°C 30 seconds; 72°C 40
seconds for 35 cycles. The PCR composition is as follows: 0.5
ul of cDNA; 2.5 ul of 10X amplification buffer; 0.5 ul dNTP (10
M); 1 ul of primer pairs (10 M each; SEQ ID NO: 4803 and SEQ ID
NO: 4804); 0.5 ul of Advantage polymerase Mix in 25 ul reaction
mixture. The amplified DNA products were resolved in 1.2%
agarose gel in TAE buffer. The gel was scanned using Typhoon™
Imaging System (Amersham Biosciences). HTPL is strongly
expressed in testis, weakly expressed in skeletal muscle, bone
marrow, lung, liver, kidney, colon and placenta, and hardly
expressed in brain, heart and uterus (FIG. 5).
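
The reaction set-up can be tabulated to find the volume remaining after the listed components, the final buffer strength, and the total time spent in the cycling steps. This is a sketch under stated assumptions: the text does not say what makes up the balance of the 25 ul reaction (water is assumed here), ramp times and any initial denaturation or final extension are ignored, and all names are illustrative.

```python
# Component volumes in ul, as listed for the 25 ul RT-PCR reaction.
COMPONENTS_UL = {
    "cDNA": 0.5,
    "10X amplification buffer": 2.5,
    "dNTP": 0.5,
    "primer pair": 1.0,
    "Advantage polymerase mix": 0.5,
}
TOTAL_VOLUME_UL = 25.0

# Cycling steps in seconds: 94 C, 59 C and 72 C, repeated for 35 cycles.
STEP_SECONDS = (15, 30, 40)
CYCLES = 35

listed_volume = sum(COMPONENTS_UL.values())
# Assumption: the balance of the reaction is made up with water.
balance_volume = TOTAL_VOLUME_UL - listed_volume
# Dilution of the 10X buffer into the final reaction volume.
final_buffer_strength = 10 * COMPONENTS_UL["10X amplification buffer"] / TOTAL_VOLUME_UL
cycling_seconds = CYCLES * sum(STEP_SECONDS)

print(f"listed components: {listed_volume} ul")
print(f"balance (assumed water): {balance_volume} ul")
print(f"final buffer strength: {final_buffer_strength}X")
print(f"time in cycling steps: {cycling_seconds} s")
```

On these figures the listed components account for 5 ul of the 25 ul reaction, and the amplification buffer is diluted to 1X.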

EXAMPLE 4 
Production of HTPL Protein 

The full length HTPL cDNA clone is cloned into the
mammalian expression vector pcDNA3.1/HisA (Invitrogen,
Carlsbad, CA, USA), transfected into COS7 cells, transfectants
selected with G418, and protein expression in transfectants
confirmed by detection of the anti-Xpress™ epitope according
to manufacturer's instructions. Protein is purified using
immobilized metal affinity chromatography and vector-encoded
protein sequence is then removed with enterokinase, per
manufacturer's instructions, followed by gel filtration and/or
HPLC.

Following epitope tag removal, HTPL protein is
present at a concentration of at least 70%, measured on a
weight basis with respect to total protein (i.e., w/w), and is
free of acrylamide monomers, bis acrylamide monomers,
polyacrylamide and ampholytes. Further HPLC purification
provides HTPL protein at a concentration of at least 95%,
measured on a weight basis with respect to total protein (i.e.,
w/w).

EXAMPLE 5
Production of Anti-HTPL Antibody 

Purified proteins prepared as in Example 4 are
conjugated to carrier proteins and used to prepare murine
monoclonal antibodies by standard techniques. Initial
screening with the unconjugated purified proteins, followed by
competitive inhibition screening using peptide fragments of
HTPL, identifies monoclonal antibodies with specificity for
HTPL.

EXAMPLE 6 
Use of HTPL Probes and Antibodies 
for Diagnosis of Male Infertility and Tumor 

After informed consent is obtained, samples are drawn
from testis or other disease tissue or cells and tested for
HTPL mRNA levels by standard techniques and tested additionally
for HTPL protein levels using anti-HTPL antibodies in a
standard ELISA.

EXAMPLE 7 

Use of HTPL Nucleic Acids, Proteins, 
and Antibodies in Therapy 

Once over-expression of HTPL is detected in patients,
HTPL antisense RNA or HTPL-specific antibody is introduced into
disease cells to reduce the amount of the protein.

Once mutations of HTPL have been detected in
patients, normal HTPL is reintroduced into the patient's
disease cells by introduction of expression vectors that drive
HTPL expression or by introducing HTPL proteins into cells.
Antibodies for the mutated forms of HTPL are used to block the
function of the abnormal forms of the protein.

EXAMPLE 8
HTPL Disease Associations 



Diseases that map to the HTPL chromosomal region are
shown in Table 3:

Table 3

Diseases mapped to human chromosome 10p12.1 (HTPL region)

mim_num   Disease                                            chromosomal location
601188    PROSTATE ADENOCARCINOMA 1                          10pter-q11
603188    SUSCEPTIBILITY TO OBESITY                          10p
604401    ARRHYTHMOGENIC RIGHT VENTRICULAR CARDIOMYOPATHY 6  10p14-p12
600964    REFSUM DISEASE WITH INCREASED PIPECOLICACIDEMIA    10pter-p11.2
602432    OPEN ANGLE GLAUCOMA 1                              Chr.10



Prostate cancer is the second leading cause of male
cancer deaths in the United States
(http://www.ncbi.nlm.nih.gov/entrez/). A genetic locus,
designated prostate adenocarcinoma 1 (PAC1), is involved in
tumor suppression of human prostate carcinoma. Sanchez et al.,
Proc. Nat. Acad. Sci. 93:2551-2556 (1996). HTPL is a candidate
gene for PAC1 as it maps to the same chromosomal region.
Alternatively, mutations of HTPL could also lead to other
diseases, such as those listed in Table 3.

All patents, patent publications, and other published
references mentioned herein are hereby incorporated by
reference in their entireties as if each had been individually
and specifically incorporated by reference herein. While
preferred illustrative embodiments of the present invention are
described, one skilled in the art will appreciate that the
present invention can be practiced by other than the described
embodiments, which are presented for purposes of illustration
only and not by way of limitation. The present invention is
limited only by the claims that follow.